Entering the Minefield of Digital Contact Tracing

Jonathan Zittrain
Berkman Klein Center Collection
May 5, 2020 · 10 min read

People across America and the world remain under strong advisories or outright orders to shelter in place, with economies largely shut down, as part of an ongoing effort to flatten the curve of the most virulent pandemic since 1918. The economic effects have been predictably staggering, with no clear end in sight.

Until a vaccine or other transformative medical intervention is developed, the broad consensus of experts is that the only way out of mass sheltering in place, if hospital occupancy curves are to remain flattened, entails waiting for most of the current cases to resolve, and then cautiously and incrementally reopening. That would mean a sequence of allowing people out; promptly testing anyone showing symptoms — and even some who are not; identifying recent proximate contacts of those who test positive; and then getting in touch with those contacts and, if circumstances dictate, asking or demanding that they individually shelter until the disease either manifests or does not. The idea is to promptly prune branches of further disease transmission in order to keep the virus’s effective reproduction number below one, so that spread does not grow exponentially.

Identifying contacts — “contact tracing” — is thus central to restoring a modicum of normalcy in daily activities without reigniting viral spread. In a world where everybody carries a smartphone, there’s an understandable desire to leverage digital technology as a means of making contact tracing happen more quickly and more accurately.

We’ve seen a surge of effort by public health authorities to scale up their existing contact tracing architectures, using tried-and-true best practices. At the core of these methods is a municipal employee or contractor who gets in touch with someone who’s had a positive test result and then interviews the person at length to review and record a week or two’s prior contacts, as best as the patient can remember, and also to offer information and advice about contending with the disease and any necessary isolation protocols. In Massachusetts, for example, state officials are working with the non-profit Partners in Health organization to hire, train, and integrate a thousand new contact tracers. Chicago is doing the same, as part of what might become a nationwide hiring and training of about a hundred thousand new contact tracers to keep pace with the virus’s transmission. These methods may use technology to organize what they learn — there are databases that keep track of what contact tracers can elicit from patients — but they do not count on device-level telemetry to help with the job of contact elicitation.

Separately, the technology behind what have been variously called digital contact tracing or exposure notification tools could greatly inform and aid the traditional mechanisms of municipal health authorities.

As I’ve tried to develop my own thinking around the privacy, efficiency, and equity of such tracking tech, I’ve found some big issues to work through in efforts that draw on the digital app/data side to support municipal public health activities, rather than building standalone apps. Given the stakes, they’re worth taking on.

1/ When there is a trusted health entity doing the work, more kinds of data could be used to help. Traditional contact tracing involves collecting lots of sensitive data from people so as to construct as full a picture as possible of who’s been near whom. Sharing all that information in, say, a phone interview with a health official involves the authorities establishing, and being bound by, a trust that the data will be used for the limited purposes of contact tracing in the public interest. If data of the sort that digital apps could collect — should they soften any technically-enforced de-identification requirements — is channeled this way, it feeds into a system that already has (or should have) privacy top of mind, is governed by specialized privacy law and professional norms, and is naturally decentralized one health department at a time.

That could be reason to be open to sharing more kinds of data than, say, pseudonymous proximity alone. For example, who’s to say that GPS location data over time would not be additionally useful, even if it is no substitute for the more precise proximity measurements provided by the short-range Bluetooth Low Energy radios built into newer smartphones? (Though there are accuracy worries about that too.) Rather than substituting, map data could help contact tracers know whether an already-established proximate contact took place inside or outside, in a place with small rooms or large ones, in a dining establishment vs. an ER waiting room. The more data a health official has, the more useful an artisanal judgment about contacts can be. That’s one way of avoiding the sorts of “facile, naive models of the world” about which UCSF’s Ali Alkhatib recently warned.
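To make that concrete, here is a minimal, invented sketch (in Python) of how location context could refine what a raw proximity signal alone can’t capture. The setting categories and risk weights below are made up purely for illustration; nothing here reflects any real app’s or health department’s logic.

```python
# Illustrative only: the same 20-minute proximity contact reads very
# differently in a park than in an ER waiting room. All categories
# and multipliers here are invented for the sake of the example.
from dataclasses import dataclass

# Invented multipliers for how risky a setting is for transmission.
SETTING_RISK = {
    "outdoors": 0.2,
    "large_indoor_space": 0.6,
    "restaurant": 1.0,
    "er_waiting_room": 1.5,
}

@dataclass
class Contact:
    token: str            # pseudonymous ID of the other party
    duration_min: float   # how long two phones were near each other
    setting: str          # inferred from map/GPS data, if available

def contact_priority(c: Contact) -> float:
    """Rough follow-up priority: duration scaled by setting risk."""
    return c.duration_min * SETTING_RISK.get(c.setting, 1.0)

contacts = [
    Contact("a1f3", duration_min=20, setting="outdoors"),
    Contact("9c2e", duration_min=20, setting="er_waiting_room"),
]

# Identical proximity signals, very different tracer priority once
# location context is layered in.
for c in sorted(contacts, key=contact_priority, reverse=True):
    print(c.token, round(contact_priority(c), 1))
```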

The emerging Apple/Google contact tracing framework doesn’t take this kind of extra data into account, since it’s built around data minimization from the start: a deliberately narrow design meant only to facilitate proximity detection between two phones. The idea is that all anyone really needs to know for digital contact tracing is when two people were near each other and for how long. But that may be only the start, rather than the entirety, of how mobile device data can help.
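For the curious, here is a stripped-down conceptual sketch of the rotating-identifier idea at the heart of that framework. The real protocol’s key derivation, cryptography, and Bluetooth details are far more careful than this; the sketch only illustrates the shape of the design: no names, no locations, only matches checked locally on each phone.

```python
# Simplified concept sketch, not the actual Apple/Google protocol:
# phones broadcast short-lived IDs derived from a secret daily key,
# and matching happens entirely on-device.
import os
import hashlib

def rolling_ids(daily_key: bytes, intervals: int = 144) -> list[bytes]:
    """Derive short-lived broadcast IDs from a daily key (simplified)."""
    return [
        hashlib.sha256(daily_key + i.to_bytes(2, "big")).digest()[:16]
        for i in range(intervals)
    ]

# Phone A broadcasts IDs derived from its secret daily key.
a_key = os.urandom(16)
a_broadcasts = rolling_ids(a_key)

# Phone B simply remembers the IDs it heard nearby.
heard_by_b = set(a_broadcasts[40:45])  # a few intervals of proximity

# If A later tests positive, A's daily key is published; B re-derives
# A's IDs locally and checks for overlap.
published_keys = [a_key]
exposed = any(
    rid in heard_by_b
    for key in published_keys
    for rid in rolling_ids(key)
)
print("Exposure detected:", exposed)  # True
```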

Interestingly, under its current design, any app using the new Apple/Google API will be forbidden from collecting location data that any other app, with user permission, can collect. So the user cannot opt in to a contact tracing app collecting location, even as half a dozen weather apps and Farmville installed on the same phone are doing just that. This echoes Zeynep Tufekci’s and Maciej Cegłowski’s points about the longstanding, massive private surveillance infrastructure that already exists to target us with just the right ad at the right time — and how, while we should surely try to eliminate that infrastructure overall, we might be open to thoughtfully doing something useful with such cavalierly collected and shared data to address a genuinely severe public health crisis.

2/ There’s an overlooked difference between opt-out/opt-in adoption models and voluntary/mandatory data usage. A user’s choice about sharing their digital data for a contact tracing scheme might be better placed at the later decision point about using already-collected private data, rather than at the earlier one about whether to collect the useful data at all. I’m still thinking this through, but consider an iPhone that automatically starts locally collecting data known to be potentially highly useful, under the protection of the phone’s passcode or biometric security that already guards lots of other sensitive user data. That gives users more options at the point they get a positive test result and very much want to share as much data as possible to help determine who else has been exposed — data that could be made shareable only to certified public health authorities. The Apple/Google framework — and all others that I know of right now contemplated for U.S. domestic use — doesn’t take this into account, since it’s built around opting in to the data collection to begin with.
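A toy sketch of that “collect locally, decide later” architecture might look like the following. The certified-authority allow-list is a hypothetical stand-in for whatever real attestation such a system would use, and the log here stands in for data kept under the device’s existing protections.

```python
# Sketch of the "collect locally, decide later" model: the encounter
# log accumulates on-device, and nothing leaves the phone unless the
# user, after a positive test, releases it to a certified authority.
import time

CERTIFIED_AUTHORITIES = {"mass.gov/dph"}  # hypothetical allow-list

class LocalContactLog:
    """Encounter log kept under the device's existing protections."""

    def __init__(self):
        self._entries = []  # stays on-device by default

    def record(self, other_token: str, duration_min: float):
        self._entries.append({
            "token": other_token,
            "duration_min": duration_min,
            "ts": time.time(),
        })

    def release_to(self, authority: str, user_confirms: bool):
        """The sharing decision happens here, at diagnosis time,
        not back when collection began."""
        if not user_confirms:
            raise PermissionError("User declined to share.")
        if authority not in CERTIFIED_AUTHORITIES:
            raise PermissionError("Not a certified public health authority.")
        return list(self._entries)

log = LocalContactLog()
log.record("a1f3", 20)
# Later, after a positive test, the user makes the sharing choice:
shared = log.release_to("mass.gov/dph", user_confirms=True)
print(len(shared), "entries shared with health authority")
```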

3/ Voluntary vs. mandatory can obscure the real issues. I’ve long been skeptical of user choice, whether “opt in” or “opt out,” as the touchstone for privacy protection. People are too readily goaded into choosing manifestly bad deals for themselves, and the fact that consent was nominally obtained shouldn’t excuse a counterparty — typically a company with much more knowledge and skill about what an app is really collecting and what will happen to the data — from anything bad it does or that follows. That’s why I’ve been developing the information fiduciaries model with Jack Balkin, under which companies would be required not to screw people over, rather than offering people an “option” between being screwed over or not.

Still, one typical reason to find value in a choice regime is that people may have different tolerances for how their data is to be used. Some people welcome getting occasional emails from an online store they’ve purchased from before. Others consider that spam. Instead of a one-size-fits-all solution in the name of consumer protection, people can choose between being on a mailing list or not. But that reasoning is at its weakest in the case of data for contact tracing. If the data is not useful — something that some of the linked essays in this conversation have argued — then there’s not much reason to give anyone a choice about opting into digital contact tracing schemes to assist municipal public health workflows. They just shouldn’t be allowed. If the data is useful, and indeed more useful the more people who contribute, then there is a strong argument to make the collection and use mandatory, with technical and legal safeguards against misuse. Should misuse happen, even those who opted in to a voluntary scheme didn’t bargain for that — the privacy violation against them is no less because they chose to participate, assuming that their privacy wouldn’t be abused. The key, in considering mandatory collection, would be to see if there’s a satisfying level of protection against abuse to make such a system worth it. If there isn’t, I’m not convinced there’s even a case for a voluntary system that goes beyond the carefully circumscribed limits set by the Apple/Google framework. It’s all or nothing.

This ties in to some of the points around equity made by organizations like the Movement for Anti-Oppressive Computing Practices. Responding to the U.S. Census is mandatory. The census solicitation letter says in all caps that YOUR RESPONSE IS REQUIRED BY LAW. There’s a strong equity argument supporting that — people afraid to be counted will disproportionately be the marginalized and oppressed, and by not responding their power will remain that much more diluted. And the fact that the census collects sensitive personal data doesn’t prevent it from being mandatory. Even in the short form it immediately asks about membership in groups that traditionally have been the basis for discrimination, after asking for identifying information of name, address, age, and the same for all other members of one’s household. While the census remains mandatory, out of respect for the special sensitivity of the data collected, its survey data is fenced off from the rest of government, including the FBI. Individual responses are under lock and key, more or less, for 72 years. Contact tracing data could be similarly protected.

Of course, if it’s to be mandatory in some way, there will be big questions about how people without the latest smartphones can be expected to participate. Maybe it means free phones; maybe it means it’s mandatory only for those with devices capable of collecting useful data. Certainly anything that makes municipal contact tracing faster or more accurate in some cases, even if not for all, represents a contribution to the public at large, given the permeable dynamics of the spread of an illness like this.

This also illustrates a difference in the frames that we might bring to this problem. Designers of digital contact tracing apps, not intended for integration with municipal initiatives, may be thinking of the app user as the “customer.” That naturally leads to judging the app’s utility on whether or not it’s providing useful advice to the app user, bringing to mind Bruce’s and others’ critiques about whether such apps will yield too many false negatives or false positives to be helpful to that person. But the public health frame sees it differently. There the central question is whether an app, alone or in conjunction with municipal tracing efforts, helps to keep or get the virus’s effective transmission rate below 1. If it does, the fact that a lot of people might be told to self-isolate when it turns out they didn’t need to is OK, so long as lots of other people can be out and about without pushing the rate back above 1.
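A bit of back-of-envelope arithmetic, with invented numbers, shows why the public health frame can tolerate individual false positives: what matters is the aggregate effect on the effective reproduction number. Note that app-driven interruption scales roughly with adoption squared, since both parties to a contact need the app.

```python
# Back-of-envelope model with invented numbers, not epidemiology.
# The public-health question is whether the effective reproduction
# number R stays below 1, not whether any single notification was
# a false positive.
R0 = 2.5                     # assumed baseline reproduction number
adoption = 0.6               # assumed fraction of people running the app
quarantine_compliance = 0.8  # assumed fraction of notified contacts who isolate

# Fraction of transmission chains the app can interrupt: both the
# infector and the contact must have the app, and the contact must
# actually isolate in time.
interrupted = adoption**2 * quarantine_compliance

R_eff = R0 * (1 - interrupted)
print(f"R_eff = {R_eff:.2f}")
# About 1.78 with these numbers: the app alone isn't enough, but
# combined with other measures it can help push R below 1.
```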

All of this is, to be clear, against a backdrop of whether the extra data in question could be game-changing for public health uses. If it isn’t, then of course we stop at square one and (regrettably?) need not enter the minefield at all.

4/ If we don’t explore useful, privacy- and equity-sensitive, broadly implemented solutions, what will happen next? The alternative may not be the status quo. Rather, we could see either complete government failure or an abdication that paves the way for inequitable second-run solutions. I could see a future in which big employers offer test-and-trace to their employees — or for that matter, require it — in a faint echo of the company towns of yesteryear. So people affiliated with big employers like, say, TJX or General Motors, or with major research universities, might see testing and tracing regimes within their communities, however limited the value overall. That’s bad from a community equity standpoint.

Alternatively, we could see a single-source Federal government contract, Department of Defense-style, going to a Northrop Grumman or another contractor not versed in thinking through consumer privacy issues — in which case we might see all sorts of surveillance with none of the offsetting protections that seem vital if it’s to happen at all. I don’t mean to invoke that prospect as a way of bulldozing towards some preordained solution.

“It could be worse” is a weak argument against something that’s already bad. But that doesn’t mean we shouldn’t be gaming out what’s likely to happen in various plausible futures, when so much seems up for grabs, and picking what we can best estimate as the least-worst path through, taking into account our risk tolerance for what happens if we’re wrong about what we’re estimating, whether in the dynamics of the disease, of the economy, or of the uses and abuses of sensitive personal data.

I’m grateful to members of an internal discussion list at the Berkman Klein Center for a broad discussion of these topics.
