Acoustic Syndromic Surveillance Using Phones

Angelique Laaks
Acoustic Epidemiology
May 13, 2020

What is syndromic surveillance?

From an epidemiological perspective, a single cough, from a single person, means very little. After all, it’s not uncommon to cough as many as dozens of times per day. Because cough is so common, even among the healthy, it doesn’t raise eyebrows. This is why you don’t get worried when you hear someone cough once in a movie theatre or a classroom — because it’s normal.

But what about when a cough is abnormal? What if during the opening credits of a film you were to hear not one or two, but ten or twenty coughs, coming from multiple people? At what point might you be worried about the health of the people in that theatre?

In the case of infectious respiratory disease, symptoms nearly always occur before diagnosis. And, depending on the population’s characteristics (access to health care, etc.), symptoms often occur without a diagnosis. For these reasons, monitoring symptoms (rather than waiting for diagnoses) can tell public health practitioners about where and when an outbreak is occurring before diseases begin to be reported by medical professionals. And as we’ve all learned firsthand in the COVID-19 crisis, rapid detection — identifying clusters of illness early — is crucial to mobilizing a rapid response, which in turn prevents further infection.

It’s called syndromic surveillance: collecting and analyzing health indicators in real time so as to identify anomalies more quickly than traditional disease reporting allows. The more quickly a threat to public health is identified, the more effective the public health response can be.

The data used for syndromic surveillance are often administrative: school absenteeism logs, the register of symptoms from an emergency room’s triage nurse, the frequency of certain laboratory tests being ordered, etc. But like sources of traditional disease reporting, syndromic surveillance data sources are often biased by selection, access to healthcare, geographic reach, etc. And though syndromic surveillance is faster than traditional disease reporting, it’s not quite as fast as disease contagion: by the time someone’s illness causes absenteeism or an emergency room admission, it’s likely that he or she has been symptomatic (and perhaps contagious) for days. And syndromic surveillance systems’ reliance on administrative data means that they can only cover areas where such data are available and timely: areas without emergency rooms, digitized school absenteeism logs, or centralized laboratory test registers can’t effectively carry out syndromic surveillance.

Sound as a symptom

An effective syndromic surveillance system needs to be fast and have wide coverage. Ideally, it should monitor symptoms themselves as they emerge, not as treatment is sought (which inevitably comes later and only applies to a small minority of the population). Because geographic clustering is important to disease outbreaks, it should take into account trends in symptoms not only over time but also over space. Ideally, it would use data sources which are consistent and interoperable so that effective comparisons can be made between different areas.

We think we can use sound.

Specifically, we believe that monitoring cough frequency over space and time can generate meaningful, real-time, actionable data on the health of a population.

We call this “acoustic surveillance”. It means (1) listening to sounds; (2) distinguishing between coughs and non-coughs; (3) aggregating the data and algorithmically detecting abnormalities in trends; (4) taking public health action when a significant increase in cough frequency is detected in an area.

Importantly, the tools for acoustic surveillance already exist and are already widely deployed. Virtually all smartphones have the capability to register sound, and machine learning algorithms are sufficiently advanced to differentiate between coughs and non-coughs. In other words, not only are the hardware and software ready for acoustic surveillance — they’re already widely distributed across different geographic areas. It’s simply a question of “turning on” the system — getting individuals to allow their coughs to be monitored and analyzing that data.

A walk-through

Consider Mary. She uses a cough-tracking app on her phone and is pretty familiar with her own cough “baseline” — that is, she coughs about 10 times a day on average.

Having established her “baseline”, it’s fairly easy to see when she got a cold. For several days in early March, she was coughing at more than double her normal rate.
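To make the idea of a personal baseline concrete, here is a minimal sketch. The numbers are made up, and the rule itself (median of the first two weeks, flag anything above double that) is our assumption for illustration, not how any particular cough-tracking app works.

```python
from statistics import median

def flag_elevated_days(daily_coughs, baseline_days=14, ratio=2.0):
    """Return the indices of days whose count exceeds ratio * baseline.

    The baseline is the median of the first `baseline_days` days.
    Both parameters are illustrative assumptions.
    """
    baseline = median(daily_coughs[:baseline_days])
    return [i for i, count in enumerate(daily_coughs) if count > ratio * baseline]

# Hypothetical data: about 10 coughs a day, then a cold on days 14 to 17.
mary = [9, 11, 10, 8, 12, 10, 9, 11, 10, 10, 9, 12, 10, 11,
        24, 31, 27, 22, 12, 10]

elevated = flag_elevated_days(mary)
print(elevated)  # -> [14, 15, 16, 17]
```

A median is used rather than a mean so that one or two odd days in the baseline period do not shift the threshold much.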

Pretty interesting, right? Sure, but not to public health authorities: they’ve got thousands of Marys to monitor, and are relatively uninterested in one person’s case of the sniffles.

So what if, instead of just looking at Mary, we looked at, say, a dozen people?

In the above, we see a great deal of variance. There are some chronic coughers and some infrequent coughers, and there are also a few spurious cases of spikes in cough, like Mary’s. But what’s important to notice in the above is that Mary’s early-March spike in coughs was not part of a larger disease trend: on the whole, the trend (blue line) for this small population remained fairly constant.

In a visualization like the above, it’s easy to lose the forest for the trees. In a real syndromic surveillance system, there’s no reason to monitor every individual, every day. Rather, we aggregate data so as to get a better understanding of overall trends. The below shows the average daily coughs in the population (black), a smoothed trend (orange), and a range of “normality” (blue).
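The three layers of that aggregated view (daily average, smoothed trend, range of “normality”) can be approximated in a few lines. This is only a sketch: the trailing moving average and the mean plus or minus two standard deviations are our assumed stand-ins for the smoothing and the band, and the counts are invented.

```python
from statistics import mean, stdev

def daily_average(counts_by_person):
    """Average coughs per person for each day (rows = people, columns = days)."""
    return [mean(day) for day in zip(*counts_by_person)]

def moving_average(values, window=3):
    """Trailing moving average; the first days use however many values exist."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(mean(chunk))
    return smoothed

def normal_range(values, k=2.0):
    """Band of mean +/- k standard deviations (k is an assumption)."""
    centre, spread = mean(values), stdev(values)
    return centre - k * spread, centre + k * spread

# Hypothetical data: an average cougher, an infrequent one, a chronic one.
people = [
    [10, 12, 8, 10, 10],
    [2, 2, 4, 2, 0],
    [30, 28, 30, 33, 32],
]

averages = daily_average(people)    # one value per day
trend = moving_average(averages)    # smoothed line
low, high = normal_range(averages)  # band of "normality"
print(averages, trend, (low, high))
```

Note how the individual differences between the three people disappear in the daily average, which is the point of aggregating.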

Of course, trends over time only cover one dimension. There’s also space. And no syndromic surveillance system would be complete without a geographic component.

Let’s take the below, 8-district, hypothetical territory. Each point, we’ll say, is a household (1,000 in total).

For the sake of this example, we’ll assume this is a rural, developing area with relatively even distribution of population and low access to healthcare services (hence, difficult to carry out syndromic surveillance via practitioners). Let’s also assume that 50% of households have a smartphone, and 10% of households have the cough-tracking app installed. So, that’s a total of 100 households (red) whose coughs are being tracked.
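The arithmetic behind those figures is worth spelling out, because it shows how thin the coverage is per district. The sketch below reads the 10% as a share of all households (which is what gives 100), and assumes tracked households are spread evenly across the 8 districts.

```python
HOUSEHOLDS = 1000
DISTRICTS = 8
SMARTPHONE_SHARE = 0.50  # households with a smartphone
APP_SHARE = 0.10         # households with the app, as a share of ALL households

tracked = round(HOUSEHOLDS * APP_SHARE)
per_district = tracked / DISTRICTS
share_of_phone_owners = APP_SHARE / SMARTPHONE_SHARE

print(tracked)                # 100 tracked households
print(per_district)           # 12.5 tracked households per district, on average
print(share_of_phone_owners)  # 0.2, i.e. one in five smartphone households
```

So each district's trend in this example rests on only a dozen or so households, which is one reason the per-household detail is so noisy.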

With acoustic surveillance, we can monitor coughs over both time and space. Take the below animation, for example, in which the number of daily coughs for each household is reflected in the point size.

Again, this level of detail is too much for an effective acoustic syndromic surveillance system. From a public health perspective, we’re unconcerned about minor variations over time, or differences in cough frequency between specific households. After all, a household of smokers might have a significantly different cough pattern than one of non-smokers. By the same token, age, phone use, device idiosyncrasies, and household size might all be factors that explain most of the variance in cough frequency.

The below shows household-specific cough frequencies, grouped by district. Each black line is a household, and the blue line is a smoothed trend.

Again, a bit too noisy. We’re interested in the big picture: anomaly detection. We can establish area-specific baseline ranges based on the previously observed data — and when a cough frequency for an area exceeds the expected range, our acoustic surveillance picks up on the change and triggers an alert.

The below shows the average number of household coughs (black dots), a smoothed (blue) trend line for each district, and an area of “within normal”, calculated based on the district’s previous month’s coughs.

Public health practitioners monitoring these data would take note of the anomalous trends in district 1 and might dispatch a team to carry out an investigation, testing, or whatever the context called for.
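A bare-bones version of that district-level alert might look like the following. It is a sketch under assumptions: the upper limit is the previous month's mean plus three standard deviations (our choice of rule and of k), and the district histories are invented.

```python
from statistics import mean, stdev

def upper_limit(previous_month, k=3.0):
    """Top of the 'within normal' band for one district (k is an assumption)."""
    return mean(previous_month) + k * stdev(previous_month)

def districts_on_alert(history, today, k=3.0):
    """Return the districts whose average today exceeds their own upper limit.

    history: {district: [daily household averages for the previous month]}
    today:   {district: today's household average}
    """
    return sorted(d for d, value in today.items()
                  if value > upper_limit(history[d], k))

# Hypothetical data: district 2 simply coughs more than district 1 as a rule.
history = {
    1: [9, 11] * 15,   # 30 days hovering around 10
    2: [19, 21] * 15,  # 30 days hovering around 20
}
today = {1: 18, 2: 21}

alerts = districts_on_alert(history, today)
print(alerts)  # -> [1]
```

District 2 has the higher cough count today, but only district 1 is flagged, because each district is compared against its own baseline rather than against a shared threshold.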

More than just coughs per day

In the above example, we’ve gone over how tracking the average number of daily coughs over time and space could be useful for syndromic surveillance at a population level. But the aggregated number of daily coughs is just a start. Acoustic surveillance could do much more.

A comprehensive acoustic surveillance system would both (a) monitor coughs and (b) learn from them. That is, the same data that triggers the anomaly alert would also form part of a feedback loop; with time, the simple “alert” system could take into account factors such as the distribution of coughs throughout the day (think night-time coughing vs. day-time coughing) and the relative dispersion vs. concentration of coughs (coughing “fits” or episodes vs. a uniform distribution throughout the day). With greater app usage, the surveillance system would improve in accuracy (fewer false alarms), timeliness (anomalous departures from baseline detected sooner), and granularity (smaller catchment areas).
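As a rough illustration of what such features could look like, the sketch below computes two of them from a day of hourly cough counts. The definitions are our assumptions: “night” is taken to be midnight to 6 a.m., and concentration is measured with the variance-to-mean ratio, which is about 1 for purely random (Poisson) counts and grows when coughs bunch into fits.

```python
from statistics import mean, pvariance

def night_share(hourly, night_hours=range(0, 6)):
    """Fraction of the day's coughs that fall in the night hours."""
    total = sum(hourly)
    return sum(hourly[h] for h in night_hours) / total if total else 0.0

def dispersion_index(hourly):
    """Variance-to-mean ratio of hourly counts; higher means more bunched."""
    m = mean(hourly)
    return pvariance(hourly) / m if m else 0.0

# Two hypothetical people with the same daily total of 24 coughs.
even = [1] * 24   # one cough every hour
fits = [0] * 24
fits[2] = 12      # two night-time coughing fits
fits[3] = 12

print(night_share(even), dispersion_index(even))  # 0.25 0.0
print(night_share(fits), dispersion_index(fits))  # 1.0 11.0
```

Both people would look identical in a coughs-per-day count, yet their patterns are very different, which is the kind of signal a richer system could learn from.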

Originally published at https://blog.hyfeapp.com on May 13, 2020.
