Is the Google / Apple contact tracing solution private enough?

With Google and Apple now providing critical technological support for contact tracing applications, it’s important that we understand the implications — both good and potentially risky — of their approach to individual privacy.

Anne Fauvre-Willis
Oasis Labs
Apr 27, 2020


Amidst the call for contact tracing applications, and the underlying data to identify, track, and notify populations of newly surging COVID-19 outbreaks, Apple and Google announced a joint effort to curb the spread of the disease without invading individual user privacy. Specifically, the two companies will release an API that will allow mobile apps to access Bluetooth proximity data on Android and iOS phones. With a collective user base of more than 3 billion, the benefit this could provide to contact tracing efforts is substantial.

Yet while this is no doubt the noble effort of two giants to leverage their strengths for a common good, it is still important to ensure that we as consumers understand how it works. After all — it’s our data that feeds their network and ultimately our collective health and well-being that depends on it.

Below I lay out a few questions that have come up regarding the solution, and highlight some of the potential fixes and gaps in Google and Apple's approach. This is by no means a rejection of that approach. I, like many, believe the two companies have the very best of intentions in mind; my aim is simply to inform, and to note some areas of their approach that may benefit from refinement.

How does it work?

Before we map out these questions, let's explain exactly how the Google / Apple approach works. The companies will roll out these updates in two phases over the next few months:

1. First, in May, they will release APIs that allow apps from public health authorities to exchange Bluetooth proximity data interoperably across Android and iOS devices.
2. In the months that follow, they plan to build this Bluetooth-based exposure notification capability directly into the underlying operating systems, so that users can participate on an opt-in basis without installing a separate app.

A great diagram from the Financial Times on how this data is collected and used is below.
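To make the mechanism concrete, here is a heavily simplified sketch of the key schedule behind the approach. It is not the actual Exposure Notification cryptography (the real specification uses HKDF and AES with precisely defined inputs); the function names, key material, and derivation details below are illustrative assumptions only. The important idea it captures is that the phone broadcasts short-lived, unlinkable identifiers derived from a secret key that never leaves the device:

```python
import hashlib
import hmac

def daily_tracing_key(master_key: bytes, day_number: int) -> bytes:
    # Hypothetical simplification: derive a per-day key from a secret
    # master key that stays on the phone.
    return hmac.new(master_key, f"day-{day_number}".encode(), hashlib.sha256).digest()

def rolling_proximity_id(day_key: bytes, interval: int) -> bytes:
    # A fresh short identifier for each ~10-minute interval; this is what
    # the phone broadcasts over Bluetooth. It is truncated and rotates,
    # so it carries no stable identity on its own.
    return hmac.new(day_key, f"interval-{interval}".encode(), hashlib.sha256).digest()[:16]

# Nearby phones record each other's rolling IDs. Only if a user later
# reports a positive test are that user's daily keys published, letting
# other phones re-derive the IDs and check them against their own logs.
master = b"secret-master-key-on-device"   # illustrative value, not real key material
dtk = daily_tracing_key(master, 18379)    # 18379 = days since epoch for Apr 27, 2020
rpi = rolling_proximity_id(dtk, 42)
print(len(rpi))  # 16
```

Deriving identifiers one-way from a hidden key, rather than broadcasting a fixed ID, is what prevents casual observers from following a single device across the day.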

Outstanding Questions about the implementation

Now that we have a general sense of the approach, I want to highlight and discuss a few key privacy-related questions that we should all be asking. (Of course there are more general questions to ask as well, such as: how many mobile phones will not have access to these software upgrades? Answer: at least 2 billion.)

Can the data collected by Google and Apple be used to track people?

While some pieces of data would need to be pieced together, yes, this is very much possible through what is called a "correlation" or "linkage" attack. As laid out in a recent whitepaper by researchers from UC Berkeley, Singapore University and here at Oasis Labs, "Bob" could set up his phone to collect all Bluetooth signals of those in proximity to him. Should one of those people later report that they have COVID-19, Bob would receive that individual's published keys and could match them against the identifiers he collected earlier. Ashkan Soltani, former chief technologist of the Federal Trade Commission, has taken this one step further, arguing that Bob could share this information on Nextdoor or other social media sites, ostracizing or flagging a neighbor or friend and potentially even putting that person at risk from others around them.
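A small sketch makes the linkage attack concrete. The derivation function, key values, and log entries below are hypothetical stand-ins, not the real protocol; the point is that published diagnosis keys let anyone who logged identifiers with timestamps and locations re-identify where and when they saw the infected person:

```python
import hashlib
import hmac

def rolling_id(day_key: bytes, interval: int) -> bytes:
    # Simplified stand-in for the rolling-proximity-ID derivation.
    return hmac.new(day_key, f"interval-{interval}".encode(), hashlib.sha256).digest()[:16]

# Bob passively logs every identifier he hears, tagged with time and place.
alice_day_key = b"alice-day-key"  # secret until Alice reports a positive test
bobs_log = [
    {"rpi": rolling_id(alice_day_key, 12), "time": "09:10", "place": "cafe"},
    {"rpi": rolling_id(b"carol-day-key", 30), "time": "12:40", "place": "gym"},
]

# Once Alice's diagnosis key is published, Bob re-derives all 144
# ten-minute identifiers for that day and links them back to his log.
published = {rolling_id(alice_day_key, i) for i in range(144)}
matches = [e for e in bobs_log if e["rpi"] in published]
print([(m["time"], m["place"]) for m in matches])  # [('09:10', 'cafe')]
```

Note that this requires no break of the cryptography at all: the matching step is exactly what legitimate phones do, just combined with side information (time and place) that the protocol cannot prevent Bob from recording.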

What will happen to my data after contact tracing is no longer needed? Can I get my data back?

Bluetooth proximity data is only held on any given phone for 14 days, so while users lose some control of their data in the short term, this should not be a long-term issue. Further, because all access to Bluetooth proximity data is opt-in, there should in theory be an easy way for users to opt out whenever they choose. Of course, it will be important that both Apple and Google make opt-in and opt-out controls easily accessible to individuals at any time.
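The 14-day window amounts to a simple retention policy on the device. A minimal sketch of such a pruning step, with invented record fields, might look like this:

```python
from datetime import date, timedelta

RETENTION_DAYS = 14  # observations older than this are discarded

def prune(observations, today):
    # Keep only logged proximity observations inside the retention window.
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [o for o in observations if o["seen_on"] >= cutoff]

today = date(2020, 4, 27)
log = [
    {"rpi": "id-a", "seen_on": today - timedelta(days=3)},   # kept
    {"rpi": "id-b", "seen_on": today - timedelta(days=20)},  # expired
]
print(len(prune(log, today)))  # 1
```

The window is not arbitrary: 14 days matches the upper end of the COVID-19 incubation period, so older contacts are no longer epidemiologically relevant.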

Could apps identify specific people with COVID19 with this data?

Yes. As outlined above, an app could run the same kind of correlation attack itself, linking published diagnosis keys to the identifiers, times, and places it has observed.

Can this data help to predict broader trends in given populations?

While contact tracing using on-device rolling proximity identifiers provides a great deal of privacy, the inability to perform additional, privacy-preserving analyses on actual contact information makes public policy decisions more difficult, because important epidemiological information remains inaccessible.

One piece of information that would be easy to derive, if more contact data were available, is the structure of the "contact graph" in different regions: a representation, drawn from the branch of mathematics called graph theory, of which individuals come into contact with which others, and how frequently. Knowing the contact graph's characteristics, and how they evolve over time, especially as shelter-in-place orders are relaxed, can help predict the rate at which an infection will spread through a vulnerable population. For example, a population consisting of many small, tight-knit, largely self-sufficient communities (with very little likelihood of transmission across communities) is relatively safer than one in which a significant fraction of each community must travel and interact with other, possibly distant communities. The former corresponds to a contact graph with many small, separated, well-connected components; the latter could even correspond to a fast-mixing "expander graph," where infections would spread quickly.
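The structural difference between these two populations can be seen with a few lines of graph code. The toy graphs below are invented for illustration: three isolated three-person communities versus the same communities bridged by travelers. Connected-component sizes immediately distinguish them:

```python
from collections import deque

def components(adj):
    # Sizes of connected components, found via breadth-first search.
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        queue, size = deque([start]), 0
        seen.add(start)
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Tight-knit, self-sufficient communities: three isolated triangles.
isolated = {0: [1, 2], 1: [0, 2], 2: [0, 1],
            3: [4, 5], 4: [3, 5], 5: [3, 4],
            6: [7, 8], 7: [6, 8], 8: [6, 7]}

# The same communities, with one traveler bridging each pair.
bridged = {k: list(v) for k, v in isolated.items()}
bridged[2] = [0, 1, 3]; bridged[3] = [4, 5, 2]
bridged[5] = [3, 4, 6]; bridged[6] = [7, 8, 5]

print(components(isolated))  # [3, 3, 3] — an outbreak stays inside one triangle
print(components(bridged))   # [9] — an outbreak can reach everyone
```

In the isolated graph, an infection seeded anywhere can reach at most three people; in the bridged graph it can reach all nine. Epidemiologists use exactly this kind of structural information, at vastly larger scale, to estimate spread, and it is precisely what the rolling-identifier design keeps hidden.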

How do you ensure the integrity of the data provided?

Offline issues like data integrity are a challenge. While an app can capture any data a user provides, it may be difficult for the app to verify that this information is true. Requiring each individual to upload a certificate of a positive or negative test result would help, but that of course could introduce additional privacy issues.
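A certificate scheme of that kind boils down to the health authority cryptographically signing a result that the app can later verify. The sketch below uses a symmetric HMAC tag purely for brevity (a real system would use public-key signatures, so that apps could verify without holding the signing secret); all names and values are illustrative assumptions:

```python
import hashlib
import hmac

# Hypothetical: a key held only by the testing authority.
AUTHORITY_KEY = b"health-authority-signing-key"

def sign_result(result: bytes) -> bytes:
    # The authority tags a test result at the lab.
    return hmac.new(AUTHORITY_KEY, result, hashlib.sha256).digest()

def verify_result(result: bytes, tag: bytes) -> bool:
    # The app checks the tag before accepting an upload; compare_digest
    # avoids timing side channels when comparing tags.
    return hmac.compare_digest(sign_result(result), tag)

cert = b"test-id:8471;outcome:positive"
tag = sign_result(cert)
print(verify_result(cert, tag))                # True
print(verify_result(b"forged:positive", tag))  # False
```

The privacy cost mentioned above is visible even in this toy version: the certificate necessarily binds a test outcome to identifying information, which is exactly the linkage the rest of the design works to avoid.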

These privacy issues could be resolved by using a confidential computing platform like the one we build at Oasis Labs (see some of the examples with the Parcel API). But when the goal is to get at least 60 percent of any given population to participate, adding such a stringent requirement could also introduce far too much friction into the system.

Alternatively, ensuring data integrity will require an army of human "contact tracers": individuals who actively seek out those infected and then track down everyone each infected person has been in contact with, all to maintain an active and up-to-date, yet very manual, social graph.

Final Thoughts

This article is not meant to criticize the efforts of Apple and Google; in fact, I applaud the investment and effort their teams are putting into this important work. It is simply meant to highlight some of the questions we, as consumers of their products and future providers of this data, should be aware of when deciding whether to opt in to giving our data to apps built on their APIs. For many of us the privacy risk may well be worth it, but it is critical to understand the tradeoff we may be making when we do.

Special thanks to Bennet Yee and Viswanath Raman for their input, edits, and comments on this blog post!


Anne Fauvre-Willis

@OasisLabs and contributor to the #OasisProtocol; former Apple / iPhone product marketer & Madeleine K. Albright staffer