Deception by Design — The Zombie Living Beyond the Uncanny Valley

Discussing the ethics of humanlike AI conversation agents

At its I/O conference in May 2018, Google previewed Duplex, a feature of Google Assistant that can pick up the phone by itself to book appointments at restaurants or hairdressers on behalf of its user, saving them time and the cumbersome experience of listening to scratchy classics while on hold. While the crowd of software developers cheered after hearing how the Google Assistant convincingly mimicked human behavior by weaving some “hmms” and “ums” into its conversation with a hair salon receptionist, commentators outside the Google bubble criticized the software as “Uncanny AI Tech”.
So what exactly is it that makes it uncanny? And why might a closer look at the ethical implications of humanlike AI conversation agents be worth your time?

Dating back to an essay by Masahiro Mori in 1970, the concept of the uncanny valley has long been discussed as a fundamental problem of robotics and artificial intelligence, describing the “sense of strangeness” we experience through the “negative familiarity” of something that is “quite human” but not perfectly so. As humans do not like to interact with systems which make them feel uneasy, much attention has been given to designing systems which accurately simulate humanness in order to keep the user engaged — an ambition which appears to become reality with novel AI voice assistants.

The development of intelligent assistants like Google’s Duplex embeds artificial intelligence into systems designed to conceal their synthetic nature.

This new generation of personal voice assistants appears to have leaped across the uncanny valley, no longer causing a feeling of repulsion in the humans interacting with them. It is no longer the behavior of systems such as Duplex itself that gives rise to uneasy feelings; instead, it is the realization that such unease remains absent. It is the realization that very soon we could find ourselves in a world in which it becomes increasingly difficult for individuals to discern whether they are talking to a human or an intelligent system on the other end of the phone line. So why would we feel uneasy about such a scenario? Could all concerns be washed away by requiring intelligent assistants to identify themselves as such at the beginning of a conversation (which appears to be the easy solution Google promised in response to its critics)?

Let’s first take the scenario where individuals do not know that they are talking to an intelligent system. Some forum discussions dive deep into the question of how far such an interaction would be disrespectful to the human, who would be deprived of a shared experience (following the thought that an AI would not really “experience” anything), which some appear to see as a precious ingredient of human interaction. I think this argument holds little ground when we look at the scenario presented in the Duplex preview: a restaurant waitress probably attaches comparatively little value to the “shared experience” between her and a customer when taking a table reservation. But I do admit that the experience argument, especially when connected to the thought that the perception of a mutual experience might foster the development of trust and bonding between conversation partners, is worth considering once the application of conversation agents moves beyond the currently limited scope of appointment bookings and enters the sphere of private conversations. (If you wonder, by the way, why Google has so far restricted the system to hairdressers and restaurant tables: the reason is simply the amount of available training data, so new application areas can be expected to emerge pretty soon.) One issue that already arises now, however, is that of power imbalances.

A human unknowingly interacting with an intelligent system will be unable to anticipate the actual capacities of their conversation partner, which, if intended, grants the system a great advantage in designing the interaction for manipulative purposes, following the idea of affective computing.

Of course, one could argue that such a power imbalance is already visible today, given the widespread use of elaborate profiling techniques. With access to a system containing such customer profiles, a waitress taking your call could similarly adapt her conversation script to a version calculated to most efficiently convince you to accept the offer of a 5pm table although you requested one at 8pm. After all, (unlike you) she knows you’ll spend 40% less on drinks than the usual crowd coming in on Saturday evenings, so better keep the prime-time spots open for someone else. (Again, we see that examples based on the previewed use case of table bookings remain modest on the scale of ethical precariousness. To increase the thrill, feel free to apply the idea analogously to other contexts such as insurance, educational systems or the housing market.)

So what is the difference between unknowingly interacting with an intelligent conversation agent and interacting with a human who has access to the knowledge of such a system, or knowingly interacting with a system such as an online booking site?

Firstly, in comparison to a human, a system interacting with you might be able to adapt its behavior to your profile with greater precision and nuance, and thus more convincingly, resulting in greater manipulative power.

Granted, I do not think we are yet at the point where AI beats humans in human-to-human conversation, but tech moves fast these days… (Also, it is quite likely that further advances in voice agents’ capabilities will speed up once they establish themselves in the market, as every interaction equals additional training data being fed back into the system for further improvements.)

Secondly, in comparison to knowingly interacting with a machine, there’s an issue of deceived expectation: your expectation of what your conversation partner knows, intends and “hears” (this last point playing with the idea that at some point intelligent conversation agents might also be superior to humans in detecting subtleties in your language, which they can then use as additional data points when personalizing their behavior to your responses). Do you know that slight unease which sometimes arises when you talk to a trained psychologist and suspect them of reading way more into every single word that leaves your mouth than you intend to express? That’s the idea I am going for. Except that with a voice AI you could not even feel uneasy, because you don’t know of its analytical capacities, and thus you couldn’t even watch your words.

It would deprive you of your ability to form reasonable expectations of the power (im-)balances of a given conversation.

And, turning away from the issue of manipulation for a moment and focusing instead on data protection issues in such a constellation, it would consequently undermine the idea of a “reasonable expectation of privacy”. Unfortunately, it is exactly that expectation which we need in order to make informed choices (okay, the use of “informed” in this context can be heavily debated, but bear with me for the sake of argument here) regarding the informational content we are willing to share: the data we are willing to reveal.

In fact, the implications for, and arising conflicts with, current privacy laws constitute a matter of such complexity that an adequate discussion is best postponed to a separate article. But just to provide some food for thought: How do you consent to the processing of your vocal information if you don’t even know such processing is taking place on the other end of the line? How sensitive is the information that can be deduced from your voice or the words you use? Ethnic origin, sexual orientation, political or religious beliefs, psychological health? And what about the principles of purpose specification and data minimization: how many words or seconds of voice recording does that cover?

Responding to critics, Google promised that, once it goes to market, Duplex will identify itself as a personal assistant system at the beginning of a conversation, a measure meant to address the concern of deceiving humans into believing they are interacting with a real person. Although that’s a step forward, with the arrival of AI conversation agents an increasing imbalance of knowledge (in the form of data access and computing power) between communication partners is looming on the horizon. As noted above, these imbalances are nothing entirely new, but they appear to turn progressively more (trust-)intrusive as a result of the interactive nature of deceptively good conversational AI, which simulates a “relationship” between two conscious actors. Research shows that even when individuals know that they are merely talking to a machine, they tend to develop a sense of trust and disclose even personal details to it. A disclosure on the part of Duplex might thus do little to change the subjective feeling of the humans interacting with it: the feeling of a normal interaction, which disguises the computational power and thus the manipulative capacities of the system on the other end of the phone line.

Persuasively human voice assistants illustrate how data processing activities increasingly vanish into the opacity of ubiquitous computing, and emphasize that the trust vested in data subjects’ ability to make informed decisions about their own privacy has to be challenged at a time when technologies are consciously designed to deceive.

Moreover, a question already heard in the context of deepfakes has to be raised: what does life feel like once we find it increasingly difficult to discern the real from the synthetic? Couldn’t you think of situations where you might prefer to introduce yourself as your personal AI at the beginning of a phone call in order to dodge the full responsibility for your words? Reversing the simulation game to freeload off the lenience we grant to the systems we build…

Image: https://flic.kr/p/hJV4qf