Quality Control: COVID-19

Peter Sun
1Point3Acres Worklog
7 min read · Apr 24, 2020

How do we keep accurate and timely records of 906,975 COVID-impacted individuals…and counting?

Photo credit: MPIX on Shutterstock

This work is co-authored by Lin Zuo and Peter Sun, volunteers at the 1p3a coronavirus tracker. See our bios at the end of the article — we are both proud alumni of Duke University!

How hard is it to deal with COVID-19 data?

Imagine you are a bank teller. Your job is to take in banknotes from a long queue of customers. Upon receiving the Franklins from a customer, you do the following:

  1. Check that they are complete (no one finds half a Franklin useful, nor a headline without the number of new patients)
  2. Verify you did not take in a counterfeit (it is big trouble to lock fake Franklins in the box, or fake numbers in the database)
  3. Punch the right numbers and details under the right account (putting too many Franklins under John Smith is just as bad as putting too few…more so for COVID-19 cases)
  4. Check the balance with the bank’s headquarters at the end of the day (Did we match the official patient count in each state? Can we explain the differences?)
We wish we could count slower than this — when COVID-19 is no longer spreading so fast

Now imagine the queue is 300 people long (the daily average number of updates we make to the database), the stack of notes you count grows 10x every 5–8 days (like the growth of North American cases in March), and you can only talk to colleagues through earphones (our team is completely remote — see how we manage it in this previous article).

And don’t forget: you are a star bank teller on live broadcast! Your audience: a cumulative 198,960,350 visits from around the globe. Also in the VIP audience are the CDC (the Situation Awareness Team, to be specific), Johns Hopkins University (whose North America data is pulled from us), the United Nations, FEMA…

Now imagine making a mistake.

Except the consequences of an untraceable mistake in COVID-19 data are far more serious than those of failing to trace a fake Franklin. Every number is a person like you and me. The general public relies on us to see the full picture. Analysts, researchers, and decision-makers rely on this data to fight the pandemic.

We must guarantee the accuracy and traceability of each data entry as the data explodes in volume and complexity. There is no other option in the face of a social responsibility this heavy.

In this second article (the first is here), we are taking you on a tour of how we developed the quality control practice at the 1point3acres Coronavirus Tracker & Dashboard.

Hint: this is not a technological saga. It is a story where simple tech solutions enabled process improvement that boosted operational efficiency.

When a QC procedure needs QC

One of the most important aspects of our quality control is maintaining data consistency with official records. Every day we compare our cases, fatalities, and recoveries with county and state health departments across the US and Canada to ensure we can match their records by the end of the day. For any inconsistency, we need to be able to explain it as the result of lags in official updates, differences in reporting definitions, or other valid causes.

Fig 1. Being one of the non-government aggregators closest to the data means a significant responsibility to be reliable in real time

In the Stone Age of the 1p3a tracker (late February, when COVID-19 first picked up in North America), we simply added a column to our database hosted on Airtable. At the end of each day, if our data aligned with that on the official websites, volunteers punched in the equivalent of “Consistent 11.51pm ET”. Otherwise, volunteers noted down the exact differences, such as “We have one more case in Durham County”, and explained why our data was credible (“A local media outlet just reported a more recent number, but the official website hasn’t updated yet”).

If only it had worked as our Stone Age volunteers hoped! While everyone gave a Slack thumbs-up to the importance of this new QC practice, we soon realized that only half of the states were stamped by volunteers every day. Half of the new column was either empty or outdated.

Half-done QC is basically no QC. Because the procedure was not executed completely, we encountered situations where a state’s data went without an update for two days. In late February, the situation in many North American communities had not fully evolved, so several incidents like this slipped under the radar of our keen users, who were laser-focused on places such as Washington State.

However, just because our visitors did not flag these issues does not mean we could look the other way. These incidents were not due to unprecedented challenges, but to a flawed procedure that had been put in place to avoid exactly such problems. Had we checked all the states daily, we would have avoided these hiccups.

Procedures don’t work themselves out. People work them out.

We asked volunteers about their experiences with the QC procedure. Dozens of Slack messages later, we came back with three key sore spots:

  1. People forgot: there was no reminder to conduct QC. Volunteers were overwhelmed by the volume of new case reports, patient details, and headlines submitted by users, and could easily forget to QC during their shifts.
  2. People didn’t know where to start: given the database configuration, it was impossible to see a list of states that still needed to be checked. The “note” field was free text and everyone had a different style of logging the QC.
  3. People felt uncertain about the workload: some state official websites updated frequently and served as our go-to sources, but others suffered from significant delays, making QC unpredictable and frustrating. Even when volunteers noticed that QC was incomplete for some states, they felt overwhelmed by not knowing how long the check would take and thus didn’t pick those states up.
Fig 2. Our initial QC procedure was flawed as it overlooked the human factor

The biggest takeaway: people did not refuse to follow the procedure because they dismissed its importance. They failed to follow it because there was no intuitive, efficient, and accommodating workflow encouraging them to do so.

With that in mind, we set out to make sure our volunteers were supported through this important process.

Ticking the Human Box

We worked with our dev lead to find a solution. It turned out four simple tweaks were all that stood between us and an optimal solution:

  1. We standardized the practice: we created a new table, “Daily Check”, in Airtable solely for hosting QC results. The QC stamps by volunteers are now standardized to five columns: the date, the volunteer’s name, the state, the type of cases checked (confirmed/death/recovery), and comments on any discrepancy.
  2. We made it easy to spot QC yet to be done: the latest “Daily Check” entry for a state is linked to and displayed in the main database next to the corresponding state. A simple filter lets a volunteer see the states that have not yet been checked today.
  3. We anticipated the difficulty level of a QC task: a difficulty level is assigned to each state to advise volunteers on how much time to expect to spend on the task. This guidance is based on metrics such as the number of counties, the cumulative confirmed cases, how hard the official website is to use, and any complaints from volunteers who checked the state before.
  4. We launched an automated reminder: we coded up a CI/CD job that automatically filters the states that have yet to be checked, then files a new GitHub ticket with the difficulty levels and data sources to remind volunteers (a sketch of such a script follows the figure below).
Fig 3. A mix of technical and procedural fixes addressed the human factor and significantly improved reliability
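
To make the fourth tweak concrete, here is a minimal Python sketch of the kind of reminder script described above. It is not our production code: the Airtable base ID, the table and field names (“States”, “Daily Check”, “Date”, “Difficulty”, “Official Source”), and the GitHub repository are hypothetical placeholders, and the script assumes the public Airtable and GitHub REST APIs with tokens supplied via environment variables.

```python
"""A minimal sketch (not our production code) of the reminder in tweak 4:
find states with no QC stamp for today in the "Daily Check" table, then
file a GitHub issue listing them with difficulty level and data source.
Base ID, table/field names, and repository are hypothetical placeholders."""
import os
from datetime import date

import requests

AIRTABLE_KEY = os.environ["AIRTABLE_API_KEY"]   # assumed secrets in env vars
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
BASE_ID = "appXXXXXXXXXXXXXX"                    # placeholder Airtable base ID
STATES_TABLE = "States"                          # assumed table listing states
CHECK_TABLE = "Daily Check"                      # the QC stamp table from tweak 1
REPO = "example-org/covid-tracker-qc"            # placeholder repository


def airtable_records(table, formula=None):
    """Fetch every record of an Airtable table, following pagination."""
    url = f"https://api.airtable.com/v0/{BASE_ID}/{requests.utils.quote(table)}"
    headers = {"Authorization": f"Bearer {AIRTABLE_KEY}"}
    params = {"filterByFormula": formula} if formula else {}
    records = []
    while True:
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        data = resp.json()
        records.extend(data["records"])
        if "offset" not in data:        # no more pages to fetch
            return records
        params["offset"] = data["offset"]


today = date.today().isoformat()

# States that already carry a QC stamp dated today
# (assumes the Date field stores an ISO date string).
checked = {
    r["fields"].get("State")
    for r in airtable_records(CHECK_TABLE, formula=f'{{Date}} = "{today}"')
}

# Everything else still needs QC; keep the difficulty and source alongside.
pending = [
    r["fields"]
    for r in airtable_records(STATES_TABLE)
    if r["fields"].get("Name") not in checked
]

if pending:
    # Build a Markdown checklist, easiest states first (assumes numeric difficulty).
    checklist = "\n".join(
        f"- [ ] {s.get('Name')} (difficulty: {s.get('Difficulty', '?')}, "
        f"source: {s.get('Official Source', 'n/a')})"
        for s in sorted(pending, key=lambda s: s.get("Difficulty", 0))
    )
    resp = requests.post(
        f"https://api.github.com/repos/{REPO}/issues",
        headers={"Authorization": f"token {GITHUB_TOKEN}"},
        json={"title": f"Daily QC reminder: states pending as of {today}",
              "body": checklist},
    )
    resp.raise_for_status()
```

Run once a day on a schedule (for example as a cron-style CI job), a script like this files one checklist issue per day; volunteers tick off states as they finish the comparison, and the difficulty estimate helps them pick tasks that fit their shift.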

Though not impeccable, the new procedure boosted the QC completion rate from 50% to north of 80% within days. With the help of our keen users and the authorities’ improving data practices, our database remains one of the most accurate in North America today, without sacrificing efficiency as the volume of data continues to increase.

One joke that never dies on our team is that our database is an “artificial intelligence”: it is intelligent thanks to procedures, streamlining, and automation, but at the end of the day it is driven by people. Perhaps the joke carries a real lesson beyond amusement: never forget the people when building a procedure.

Stay tuned for more insights, reflections and most importantly, up-to-date COVID-19 resources on one site at https://coronavirus.1point3acres.com/en, and follow our Twitter at @1p3adev!

About the Co-authors:

Lin is a recent graduate from Duke University. She has spent most of her post-graduation time on the 1p3a Coronavirus Tracker to do her part in the pandemic.

Peter is a recent graduate from Duke University. He joined the 1p3a Coronavirus Tracker when spring break fell apart. Peter now volunteers in the data team, social media operations and the 1p3a Worklog.
