Why We Need More Than Bluetooth Data to Fight Covid-19

--

By Dr. Cyrus Shahabi

In order to stop the pandemic in its tracks, we need a nuanced risk assessment that takes many factors into consideration, including people’s mobility patterns.

Image by George Peters/iStock

Digital contact tracing and testing are pretty much the only solutions offered for gradually relaxing stay-at-home orders and letting us get back to our daily lives. That is, until either the virus disappears by itself or we reach herd immunity, whether naturally or through vaccination.

A recent effort by Apple and Google to collect co-location data, which would enable large-scale contact tracing in real time using Bluetooth-based proximity detection, is a step in the right direction. The fact that both companies are working together for this purpose is commendable.

But it’s not enough.

Let me start by explaining the limitations of the current Google and Apple approach.

First and foremost, because Bluetooth signals have a relatively long range and can penetrate walls, ceilings and floors, this approach would result in many false positives.

Two people can be on different floors of a building and still be considered “co-located” according to their Bluetooth devices, just as Bluetooth speakers or headsets can pick up music played on your phone from nearby rooms.

You’re unlikely to be directly infected by someone you never meet face-to-face, but your phone won’t know this using Bluetooth alone. We need more data (for instance, your social relationship to someone, inferred from previous frequent encounters) to distinguish the people you have directly come into contact with from someone who just happens to be in the next hotel room.

In addition, by design, the proposed approach would alert anyone who has been in contact with a confirmed case, including these false positives. This method may be effective now, while we are all at home.

But as soon as we all go back to work and school, we may each receive several alerts per day, to the point that we become numb to them.

We witnessed the same effect in the past with car theft alarms and tornado and storm warnings in some parts of the country, with disastrous consequences.

Finally, this approach will not identify cases of environmental transmission: you can, after all, become infected by touching surfaces that have been contaminated by a person who is long gone, along with their Bluetooth signal. Moreover, with this approach, we can only identify direct, person-to-person transmissions.

For instance, consider the following scenario. Alice infects Bob. Bob, who is now infected but pre-symptomatic, comes into contact with David three days later. With the proposed approach, we cannot alert David until much later, when Bob develops symptoms and becomes confirmed.

Graphic by Haotian Mai/USC.

Detecting these indirect transmissions by pre-symptomatic people, who have been infected and are incubating the virus, is critical to stopping the spread of COVID-19.
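The Alice, Bob and David scenario can be illustrated with a small sketch that walks a contact log outward from a confirmed case. This is only an illustration under stated assumptions, not the Apple and Google protocol or any deployed system: the contact log, the extra names (Carol, Eve), the dates and the `exposed_people` function are all hypothetical, and the sketch assumes a contact can only pass on the virus after the day that person was exposed themselves.

```python
from collections import defaultdict
from datetime import date

# Hypothetical contact log: (person_a, person_b, day of the contact).
contacts = [
    ("Alice", "Bob", date(2020, 4, 1)),
    ("Bob", "David", date(2020, 4, 4)),   # three days after Bob met Alice
    ("Alice", "Carol", date(2020, 4, 2)),
    ("Eve", "Bob", date(2020, 3, 20)),    # before Bob could have been infected
]

def exposed_people(contacts, confirmed, max_hops=2):
    """Return {person: hops} for people linked to a confirmed case.

    A link only counts if the contact happened on or after the day the
    nearer person was exposed (an assumption made for this sketch).
    """
    neighbors = defaultdict(list)
    for a, b, day in contacts:
        neighbors[a].append((b, day))
        neighbors[b].append((a, day))

    # Map each exposed person to (hops from the confirmed case, exposure day).
    # The confirmed case is treated as infectious for the whole log.
    exposed = {confirmed: (0, date.min)}
    frontier = [confirmed]
    for hop in range(1, max_hops + 1):
        candidates = {}
        for person in frontier:
            exposed_on = exposed[person][1]
            for other, day in neighbors[person]:
                if other in exposed or day < exposed_on:
                    continue
                # Keep the earliest qualifying contact day for each person.
                if other not in candidates or day < candidates[other]:
                    candidates[other] = day
        for other, day in candidates.items():
            exposed[other] = (hop, day)
        frontier = list(candidates)

    del exposed[confirmed]
    return {person: hop for person, (hop, _) in exposed.items()}

print(exposed_people(contacts, "Alice"))
```

With `max_hops=1` only Bob and Carol would be flagged, which mirrors the direct-contact limitation described above. Allowing a second hop reaches David as soon as Alice is confirmed, without waiting for Bob to develop symptoms.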

We still need the Bluetooth data to understand people’s proximity to one another. But collecting additional data, in particular, contact location (by using the phone’s GPS, for instance), can alleviate or address all the shortcomings described above.

Using a hybrid approach of collecting both locations and co-locations has various advantages.

First, instead of a binary alert that you were either in contact with a confirmed person or not, your app can show you — and only you — a “risk score” without constant alerts, so we are not overwhelmed or alarmed unnecessarily.

This score can be computed using various AI approaches that crunch all the data about the duration of your contacts, the popularity of contact locations, the risk score of the contact person and of the other people in that location, whether the contact was a one-off case with a stranger or someone with previous frequent contacts, and so forth.

Imagine something like a private FICO score. But unlike a FICO score, this aggregate score does not need to be released to anyone. As you move on with your life, this score on your private app will be updated. And we hope those with higher scores will have the good faith or conscience to stay home.
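As a rough illustration of how such a score might combine the factors listed above, here is a minimal sketch. It is not a trained AI model: it uses a simple hand-weighted "noisy-OR" combination, and every name, weight and formula in it (the `Contact` fields, the 30-minute duration scale, the 0.5 stranger discount) is a hypothetical placeholder, not a calibrated epidemiological value.

```python
import math
from dataclasses import dataclass

@dataclass
class Contact:
    minutes: float            # duration of the contact
    other_risk: float         # risk score of the other person, 0..1
    location_crowding: float  # popularity of the location, 0 (empty) .. 1 (packed)
    frequent_contact: bool    # True if you have met this person often before

def contact_probability(c: Contact) -> float:
    """Hypothetical chance that one contact led to transmission."""
    # Longer contacts count more, with diminishing returns.
    duration_factor = 1.0 - math.exp(-c.minutes / 30.0)
    # A stranger's Bluetooth signal is more likely a false positive
    # (for example, through a wall), so it is discounted.
    familiarity_factor = 1.0 if c.frequent_contact else 0.5
    # Crowded locations raise the weight of the contact.
    crowding_factor = 0.5 + 0.5 * c.location_crowding
    return c.other_risk * duration_factor * familiarity_factor * crowding_factor

def risk_score(contacts: list[Contact]) -> float:
    """Noisy-OR: probability that at least one contact transmitted."""
    no_transmission = 1.0
    for c in contacts:
        no_transmission *= 1.0 - contact_probability(c)
    return 1.0 - no_transmission

week = [
    Contact(minutes=5, other_risk=0.2, location_crowding=0.9, frequent_contact=False),
    Contact(minutes=120, other_risk=0.6, location_crowding=0.3, frequent_contact=True),
]
print(round(risk_score(week), 2))
```

Because the score is a single number between 0 and 1 that rises gradually as contacts accumulate, the app can display it without firing a separate alert for every encounter.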

Locations and destinations can also have risk scores, aggregated from the same underlying data. You can use the location scores to decide whether you should go to a location or not (for instance, postpone shopping when your grocery store is crowded with high-risk individuals) and use your risk score to decide whether you should get tested or not.

Released publicly, without any identifying attributes so that privacy is protected, these scores can be used to identify hotspots, understand spread patterns and help inform the statistical predictive models.

Privacy has been the main justification for not collecting location data for contact tracing. The challenge tech companies face, and what causes their hesitancy, is public trust. However, we knowingly trust companies to track our location all the time, whether to shave a few minutes off our commute or receive a coupon for our next Starbucks stop.

So, why not trust these companies with our location data to stop a pandemic? Now is an opportune time to take the technology we as consumers already use for convenience and put it to work saving lives.

As someone who has spent many years of my career researching location privacy, I also want to provide reassurance.

There is a decade of research that can be incorporated, starting with simple measures that let users control the specificity of their reported locations (for instance, building level or shopping-center level), limit the frequency of reporting (for example, once or twice a day) and remove sensitive locations.
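The three simple user controls just mentioned can be sketched in a few lines. This is only an illustration under assumptions: the `prepare_report` function, its parameters and the sample coordinates are hypothetical, and rounding coordinates is a crude stand-in for "building level" specificity, not a formal privacy guarantee. Rounding to three decimal places corresponds to roughly 100 meters of latitude.

```python
from datetime import datetime

def coarsen(lat, lon, decimals):
    """Snap a coordinate to a coarse grid cell by rounding."""
    return round(lat, decimals), round(lon, decimals)

def prepare_report(points, decimals=3, max_per_day=2, sensitive=()):
    """Filter raw (datetime, lat, lon) points before they leave the phone.

    decimals    -- specificity of reported locations (fewer = coarser)
    max_per_day -- frequency of reporting
    sensitive   -- (lat, lon) places the user never wants reported
    """
    # Sensitive places are matched by grid cell, so a point that rounds
    # into a neighboring cell would not be caught by this simple check.
    sensitive_cells = {coarsen(lat, lon, decimals) for lat, lon in sensitive}
    reported = []
    per_day = {}
    for when, lat, lon in sorted(points):
        cell = coarsen(lat, lon, decimals)
        if cell in sensitive_cells:
            continue  # drop sensitive locations entirely
        day = when.date()
        if per_day.get(day, 0) >= max_per_day:
            continue  # daily reporting budget already used
        per_day[day] = per_day.get(day, 0) + 1
        reported.append((when, *cell))
    return reported

points = [
    (datetime(2020, 4, 1, 9), 34.02235, -118.28512),
    (datetime(2020, 4, 1, 12), 34.05012, -118.25003),  # user-marked sensitive place
    (datetime(2020, 4, 1, 15), 34.02241, -118.28498),
    (datetime(2020, 4, 1, 18), 34.02180, -118.28600),
]
for row in prepare_report(points, sensitive=[(34.0501, -118.2500)]):
    print(row)
```

Running the filter on the phone, before anything is uploaded, keeps the choice of specificity and frequency in the user's hands.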

More sophisticated measures and technologies can also be incorporated, such as storing and searching all data in an encrypted form, similar to how passwords or banking information are stored. Over the past decade, both the simple and the sophisticated approaches have been made bulletproof by math for location data.

Collecting location data would also help update our epidemic modeling and planning since the current models tend to be inaccurate.

One model predicted a minimum of 100,000 deaths in three weeks, even with social distancing. That’s because these models don’t have detailed data; they can’t track mobility patterns or take stay-at-home orders into consideration; and they are generic, or independent of specific geographical locations.

As an analogy, we are forecasting the spread of coronavirus the way we predicted traffic 15 years ago, without data, rather than the way we can now with Waze and Google, which have detailed location-specific mobility data.

And without accurate data and the implementation of important tech solutions, we run the risk of rushing back to our daily lives without precautions, causing stress to our healthcare system and greater risk to our population all over again.

Hopefully, these data-driven solutions move us beyond the solution offered 100 years ago for the Spanish flu: stay home and wash your hands.

Cyrus Shahabi is a professor of computer science, electrical engineering and spatial sciences, and the chair of USC’s Department of Computer Science.

USC’s Department of Computer Science publishes a broad range of stories and research. The views and opinions expressed in this article are those of the authors and do not necessarily reflect the official policy or position of the University of Southern California.

Professor Shahabi receives consulting income from Google for an unrelated project.

--

USC Department of Computer Science

Leading research and commentary from the University of Southern California’s Department of Computer Science.