Machine learning is increasingly infiltrating our lives, both in ways we can identify, control, and interact with — and in ways we may not even perceive and cannot challenge. What ever happened to privacy? Is it a meaningful concept? Is it worth trying to preserve? Is it a right, a luxury, or an illusion in an era of data collection and connected, learning artificial intelligence (AI)?
As part of the “Conversations at RAND” series, designed to highlight important and timely research, a panel discussion on “The Collision of AI and Privacy” brought together information scientist Rebecca Balebako and engineer and Pardee RAND Graduate School professor Osonde Osoba Tuesday, February 20, at RAND’s Pittsburgh office. Director of the Engineering and Applied Sciences Research Department, Pardee RAND professor, and Impact Lab codirector William Welser IV moderated the discussion, welcoming guests to an exploration of “two of my favorite topics: killer sentient robots, and Big Brother.”
Welser set up the debate by defining some key terms for the purposes of the conversation. AI, while representing many things to many people, is primarily “a nonbiologic, autonomous learning system.” Privacy comes in many varieties, but a good working definition is “when one has control over what is shared, known, or exposed about themselves.”
We are all surrounded by AI, constantly observing our habits, timing our comings and goings, noting our tastes. Netflix recommends movies you might like based on your viewing habits — and, as Osoba discovered when he shared an account with his sister, your gender, too. Your iPhone tells you unasked how long it will take you to get to the gym — because you often go to the gym at about this time, as Balebako realized when her phone volunteered a traffic report. Smart devices know our needs because they are good at recognizing patterns, and human beings are creatures of routine.
“We are not so unpredictable,” Osoba said. “These systems are able to observe a wide array of patterns and signals — how often you open the fridge, how often you talk to your spouse at home, all that information goes in to inform, very automatically, very intelligent systems on what you might need in the future. It’s not that hard. If you compare the task of trying to imagine what I might need to the task of trying to forecast demand for a whole consumer base — Amazon does that every day, routinely, with artificial intelligence. Doing it for one person is not that difficult.”
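To make that routine-spotting concrete, here is a toy sketch, invented purely for illustration and not anything the panelists built or described in detail, of how a system might guess where someone is headed at a given hour simply by counting past visits; the log, the place names, and the guessing rule are all assumptions:

```python
from collections import Counter, defaultdict

def build_habits(log):
    """Map each hour of the day to a count of places visited at that hour."""
    habits = defaultdict(Counter)
    for hour, place in log:
        habits[hour][place] += 1
    return habits

def guess_destination(habits, hour):
    """Return the most frequently visited place for this hour, if any history exists."""
    if not habits[hour]:
        return None
    place, _count = habits[hour].most_common(1)[0]
    return place

# A week's worth of (hour, place) observations, invented for illustration.
log = [(8, "office"), (8, "office"), (8, "office"),
       (18, "gym"), (18, "gym"), (18, "home")]

habits = build_habits(log)
print(guess_destination(habits, 18))  # -> "gym": the usual 6 p.m. pattern
```

A system with access to far more signals works the same way in spirit: frequent patterns become predictions.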
The panelists brought a high level of expertise to this timely issue, having done the research behind recent RAND publications: Rethinking Data Privacy; Can Smartphones and Privacy Coexist?; The Risks of Artificial Intelligence to Security and the Future of Work; and Fake Voices Will Become Worryingly Accurate.
Balebako, whose recent work has focused on smartphones, raised concerns about those vital tools we keep always within reach and rely on so heavily — typically without even reading the user agreement. Few of us understand, for example, that it isn’t just Google Maps monitoring where we are. “You may get information about what types of data are being collected,” she cautioned, “but … it doesn’t tell you where that data is going or how often. So if you’re actually seeing your location was collected 1,000 times in the past 24 hours — and that is not an unrealistic number — and it went to 13 different advertising companies, that’s when people start to get surprised. … It’s a surprise: It’s unexpected use of information; it’s an unexpected amount of information. And you don’t know who you’re sharing it with.”
Advertisers want data that will tell them when you’re moving through a major life change, such as a pregnancy, marriage, divorce, or home purchase, because such transitions change your needs, making you vulnerable to a sales pitch. But Osoba made the argument that vulnerabilities discovered through data collection don’t have to be exploited for gain. He cited the use of social media data to map influenza outbreaks as a public-health good accomplished through data collection.
Welser steered the discussion toward questions of fairness and equity in privacy and data use. Machines seem at first glance to be inherently free of bias, but they learn from the biases of their creators and of the people who build the data sets and models that make up their “experience.” “And it’s not just a question of men versus women, or race,” Osoba explained. “Any time there is a sort of classification of the general population, if there is not enough data on all parts of the population on which you are deploying the system, you’re going to have these disparities in fairness, disparities in outcomes, which sometimes matter quite a bit.” As an example, he cited an algorithm trained to estimate recidivism risk that showed considerable racial bias.
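The mechanism Osoba describes, disparate outcomes when part of the population is underrepresented in the training data, can be shown with a small, hypothetical simulation. Nothing below comes from the panel or from the recidivism study he cites; the groups, feature distributions, and model are assumptions chosen only to make the effect visible:

```python
# Illustrative sketch: a model trained mostly on group A tends to be less
# accurate for group B, even though the code never mentions group membership.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Simulate one group whose feature-outcome relationship differs slightly."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > shift).astype(int)
    return X, y

# Group A dominates the training data; group B is barely represented.
Xa, ya = make_group(5000, shift=0.0)
Xb, yb = make_group(100, shift=1.5)
model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Accuracy on fresh samples is typically noticeably lower for group B.
for name, shift in [("group A", 0.0), ("group B", 1.5)]:
    X_test, y_test = make_group(2000, shift)
    print(name, "accuracy:", round(model.score(X_test, y_test), 3))
```

The disparity here is an artifact of the invented data, but the pattern, less data about a group leading to worse outcomes for that group, is the one the panel warned about.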
Welser compared an algorithm to a recipe, whose ingredients are data; the algorithm sets up “how you’re going to take that data and use it to a particular end,” he said. “But it’s not just the data that you’re feeding in; it’s also the people who are creating those recipes, and their implicit assumptions about the world.”
Where does all that leave privacy? Is privacy an inalienable right? Is it for sale? “I would argue that you need privacy to support a lot of the other rights that we have,” Balebako said, citing the chilling effect surveillance may have on freedom of speech. “Yes, privacy — although it’s not written into our Constitution explicitly — it’s a human right.”
The right to privacy, however, has become less available to low-income individuals, whose use of government services may depend on surrendering personal data via questionnaires and forms that must be filled out, and whose neighborhoods may be watched by anti-crime cameras. “Different amounts of data … are being collected about different populations,” Balebako said, “in particular vulnerable populations. We’re collecting more information about them, particularly to use to find out if they’re doing something ‘wrong.’ … I would like to see a crime map,” she added, “that includes white-collar crime. I haven’t seen that yet.”
Osoba discouraged bias against AI, asserting that, particularly in the developing world, AI can do “awesome things.” He also sees a role for RAND in the developing upheaval AI brings to the labor market, as it begins to perform white-collar jobs as well as manufacturing roles. “It’d be interesting, as a policy think tank, to start thinking about robust mechanisms … to try to safeguard people’s livelihoods even when they lose their jobs, even when they’re switching between jobs.”
Far from urging us simply to welcome (or battle) our robot overlords, the panelists highlighted positive trends and developments in the evolving AI-privacy landscape. Balebako hailed companies that temper their expansive data collection by “blurring” or aggregating consumer data, and Osoba praised machine learning systems that explain their decision process rather than hiding it in a black box. Privacy concerns can act as a drag on AI development, but viewing the tension as a zero-sum game, in which one must hamper the other, is not the best way forward. Questions of privacy and the public good can be navigated by keeping human beings in the loop, keeping AI transparent, and ensuring that AI remains aligned with our societal norms and values.
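What “blurring” or aggregation looks like in practice varies by company, and the panel did not spell out a method; the sketch below is only one plausible reading, in which raw location pings are coarsened to grid cells and only cell counts above a minimum are passed along:

```python
# Hypothetical example of "blurring" location data before sharing it:
# coarsen raw GPS points to a grid and report only aggregate counts per cell,
# so no individual trace is passed along. Grid size and threshold are assumptions.
from collections import Counter

def blur_point(lat, lon, precision=2):
    """Round coordinates to roughly kilometer-scale cells (2 decimal places)."""
    return (round(lat, precision), round(lon, precision))

def aggregate(points, precision=2, min_count=5):
    """Count visits per blurred cell, dropping cells too sparse to share safely."""
    counts = Counter(blur_point(lat, lon, precision) for lat, lon in points)
    return {cell: n for cell, n in counts.items() if n >= min_count}

# Example: a day of raw pings collapses into a handful of coarse cells.
raw_pings = [(40.4406, -79.9959), (40.4412, -79.9961), (40.4573, -80.0088)] * 4
print(aggregate(raw_pings))
```

The design choice is the point: the recipient learns that a neighborhood was busy, not that a particular phone visited a particular address.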
Not that that will be easy. “Even understanding these complicated algorithms and systems, there’s a lot of work that needs to happen before we can get there,” said Balebako.
“And I would love to do that work.”
— Samantha Bennett
This originally appeared on The RAND Blog on March 1, 2018.