Profiling presenting symptoms of patients screened for SARS-CoV-2

--

Alison Callahan*, Jason A. Fries*, Saurabh Gombar, Birju Patel, and Nigam H. Shah (*equal contributors)

There is high interest in characterizing the presenting symptoms of individuals with COVID-19, both to inform diagnosis and triage decisions and to identify patients at risk of serious complications. As one of the many efforts in Stanford Medicine’s data science response to the current pandemic, we developed a text processing system to identify clinical observations in the notes written by care providers when screening patients for COVID-19.

Overview of how data science efforts support different information needs for the COVID-19 response

The text processing pipeline is built using Stanford’s Snorkel framework, a system for rapidly training machine learning models using noisy rules, and a custom library for classifying common clinical observations mentioned in patient notes. For each clinical observation, the pipeline identifies its context: the subject of the observation (for example, the patient or a family member), its temporal period (whether it describes a current or past state), and where in the clinical note it occurs (for example, the history of present illness or the care plan). It also captures observations such as smoking status and recent travel history.
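To make the idea of observation context concrete, here is a minimal sketch of what tagging a single mention might look like. It is a plain-Python stand-in, not Snorkel's API or our custom library: the `ObservationMention` class, the `tag_mention` function, and both keyword patterns are hypothetical and chosen only for illustration. A real pipeline combines many noisy rules rather than relying on one keyword match.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword patterns (assumptions for illustration only).
FAMILY_PATTERN = re.compile(
    r"\b(mother|father|sister|brother|husband|wife|son|daughter)\b"
)
PAST_PATTERN = re.compile(r"\b(history of|previously|last year|resolved)\b")


@dataclass
class ObservationMention:
    """One clinical observation plus the context it was mentioned in."""
    text: str         # the observation, e.g. "cough"
    subject: str      # "patient" or "family_member"
    temporality: str  # "current" or "past"
    section: str      # note section, e.g. "history of present illness"


def tag_mention(observation: str, sentence: str, section: str) -> ObservationMention:
    """Apply two simple keyword rules to the sentence containing the mention."""
    lowered = sentence.lower()
    subject = "family_member" if FAMILY_PATTERN.search(lowered) else "patient"
    temporality = "past" if PAST_PATTERN.search(lowered) else "current"
    return ObservationMention(observation, subject, temporality, section)


mention = tag_mention(
    "cough",
    "Patient's husband has had a cough for 3 days.",
    "history of present illness",
)
print(mention.subject, mention.temporality)
```

Keeping the subject and temporal period alongside each mention is what allows later analysis to count only observations that describe the patient's current state.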

We analyzed the emergency department (ED) assessment notes of 895 patients screened and tested for SARS-CoV-2 at Stanford Health Care as of March 31, 2020. Mentions of related clinical observations were grouped into more general categories (e.g., ‘shortness of breath’, ‘sob’, and ‘dyspnea’ were grouped into ‘dyspnea’). We then quantified the frequency of clinical observations in the records of those who tested positive or negative for SARS-CoV-2. From these frequencies we computed the conditional probabilities of a positive or negative test result given the mention of a particular observation in these notes, i.e., P(+ve|observation) and P(-ve|observation). The result is a symptom profile of the patients screened and tested for SARS-CoV-2.
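The grouping and counting steps can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the analysis: the `SYNONYMS` map contains only the example from the text, the function names are hypothetical, the patient records are made-up toy data, and counting each observation at most once per patient is an assumption.

```python
from collections import Counter

# Maps surface forms to a general category (only the example from the text).
SYNONYMS = {
    "shortness of breath": "dyspnea",
    "sob": "dyspnea",
    "dyspnea": "dyspnea",
}


def normalize(mention):
    """Lowercase a mention and map it to its general category if one is known."""
    key = mention.strip().lower()
    return SYNONYMS.get(key, key)


def conditional_probabilities(patients):
    """Return {observation: (P(+ve|observation), P(-ve|observation))}.

    `patients` is an iterable of (tested_positive, mentions) pairs.
    Assumption: each observation is counted at most once per patient.
    """
    positive = Counter()
    total = Counter()
    for tested_positive, mentions in patients:
        for observation in {normalize(m) for m in mentions}:
            total[observation] += 1
            if tested_positive:
                positive[observation] += 1
    return {
        obs: (positive[obs] / n, (n - positive[obs]) / n)
        for obs, n in total.items()
    }


# Toy records, invented purely to exercise the function.
toy_patients = [
    (True, ["SOB", "cough"]),
    (False, ["shortness of breath"]),
    (False, ["cough", "dyspnea"]),
    (True, ["fever"]),
]
profile = conditional_probabilities(toy_patients)
for observation, (p_pos, p_neg) in sorted(profile.items()):
    print(f"{observation}: P(+ve)={p_pos:.2f}, P(-ve)={p_neg:.2f}")
```

In this toy data, three patients mention dyspnea under some name and one of them tested positive, so P(+ve|dyspnea) is 1/3. With so few records the estimate is meaningless, which is why the frequencies are reported alongside the probabilities.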

No single presenting symptom or observation differentiates those who tested positive for SARS-CoV-2 from those who tested negative, implying that presenting symptoms alone may not be sufficient to reliably diagnose a patient with COVID-19. However, such information can assist triage decisions as well as inform the data-driven design of symptom tracking surveys.

Therefore, we make this symptom profile available as a public resource to assist the multiple symptom surveying and COVID-19 symptom tracking efforts underway. The table here lists these frequencies and the corresponding conditional probabilities for the 50 most frequently mentioned clinical observations.

--