A.I., Experience and Personhood

I work in the field of Artificial Intelligence, and although the actual job I’ve come to have is only peripherally related to that, I have a background in cognitive science and philosophy. Given this, people sometimes come to me with relevant questions. A while back, a friend of mine posed the following:

Is there a level of AI programming at which it would be unethical to disable a computer? Can killing a machine be morally wrong?

This bears on a number of things that I’ve thought a lot about, albeit more in philosophy of mind than ethics. While I’ve had a pretty consistent opinion about this stuff for some time, I don’t think I’ve ever produced a written summary of my particular brand of Dennettian pragmatism. So here goes.

A (Relatively) Short Answer

Ethics is enormously complicated and I’d actually like to avoid it as much as possible, but luckily I think I can get to the root of what you’re interested in without getting too far into the weeds of moral reasoning by rephrasing the question a couple times.

First, for various reasons, questions of the form “can doing X be morally wrong?” always have dumb answers that have nothing to do with what you’re interested in, so let’s try this: at what point (if any) does a machine attain the moral status and attendant rights of personhood? Does that sound like it addresses the basic concern?

If you want a TL;DR answer to that question, here it is: a machine ought to be considered to have the status of personhood when the best way of explaining or interpreting the machine’s behavior involves treating it as though it has an internal mental experience that’s roughly as rich as a person’s.

We’ll get there. First, let’s dispense a little further with the issue of morality. We can probably all agree that a big part of why we think of humans as having the kind of moral status that they do (i.e. the one that makes it so killing them is automatically bad, modulo extenuating circumstances) is their rich internal experience. We tend to correlate moral status with the richness of internal experience. Killing a sponge is no big deal. We have mixed feelings about mammals, and those feelings correlate with how comprehensible their behavior is in our terms. Killing dogs, which co-evolved to share a chunk of our psychology, is a big no-no, for instance. Killing a non-human great ape is obviously terrible; they’re most of the way toward personhood.

So it seems like the real question is actually a philosophy of mind question: how do we know whether something has a rich internal experience? It’s complicated, and it depends on your standards for using the word “know”.

A lot of philosophical problems crop up here because people tend to be unusually willing to indulge in skepticism for the sake of argument. You can always be skeptical about pretty much anything: for any given fact there’s a sense in which you could argue that you don’t reeeaaalllly ever know it (and yes, I’m including “I think therefore I am”). It’s just a matter of whether that skepticism is useful to help you avoid the practical consequences of false beliefs, or whether it’s being taken too far for academic reasons. When it comes to discussions about whether various hypothetical beings have rich internal experiences, for some reason people are ready to turn the skepticism dial all the way up, in a way that they wouldn’t normally outside of philosophical debate.

In fact, in such circumstances, the question is often asked about other humans: how do you know other people actually have a rich internal experience like yours? Maybe they’re all just mindless biological machines, and you’re the only conscious being in a solipsistic world. You can’t really get inside of their heads and find out, so there’s always technically room for skepticism.

But that’s not the kind of thing we usually worry about. One of the reasons for this, I’d like to suggest, is that hypothesizing about other people’s mental states is super useful. It makes for an easy way to explain their behavior. You can refer to their beliefs and goals and experiences to get a sense of why they behave the way they do. It’s not the only way of explaining their behavior. If you want to know why (or whether) I’m going to go get a cheeseburger for lunch, in principle you could always try and refer to stuff like how organisms like me are observed to seek foods with certain fat/protein mixtures, or how the chemistry of my neurons drives my muscles, or how all the quarks in my body are interacting at a subatomic level.

Despite being thorough, those aren’t actually very good explanations — they include a lot of unnecessary detail, and they’re not pitched at the right level for everyday use. Instead, you could refer to other facts: that I really like cheeseburgers, that I forgot to bring lunch today and that I know there’s a burger shop nearby. That’d typically be considered a really good explanation for my behavior, unlike an interminable discussion about quarks or the moving parts of my body.

So for a machine, I’d suggest we apply about the same level of skepticism that we do for other humans. When its behavior is complex and expressive enough that the best way to explain it is in terms of mental states that are about as complex as a human’s, then almost by definition the simplest hypothesis is that it has an internal experience that’s about as rich as a human’s.

Daniel Dennett calls the process of interpreting something’s behavior in terms of internal mental states “the intentional stance”, and more specifically, the process of trying to understand what someone else’s mental state is given the best evidence available “heterophenomenology”. I’m going to gloss over a lot of technical stuff here, and just note that one important feature of our own experience is that it’s a kind of active process that we go through, and that we have strong reasons to believe that it’s ultimately driven by underlying biological processes. If we apply heterophenomenology to a complicated enough A.I., we’d expect to find processes that are in some way analogous (probably not perfectly so, but roughly) to our biological ones. Being able to correlate these with features of its behavior as interpreted through the intentional stance would be a very strong argument for considering the A.I. to have rich mental experiences.

There are still a couple of skeptical threats. How unlike our biological processes can the processes that drive an A.I.’s behavior be before we no longer consider them as evidence for mental processes? There’s probably not a very good answer for this right now, and it’s really limited by our knowledge of what’s required to generate the kind of behavior that’s best explained with the intentional stance. There’s probably a big gray area. But the most common skeptical threats don’t come from that gray area; they come from extreme hypothetical examples.

The basic way of generating skepticism about whether the intentional stance is appropriate is to hypothesize simple machines that certainly don’t undergo complex processes of the sort that we’d expect to correlate with mental experience, but nonetheless demonstrate enough complex human behavior to fool us into using the intentional stance.

A relatively easily dealt with threat comes from things like modern chat bots, which aren’t very sophisticated at all, and tend to pursue conversations through pattern matching over a relatively small number of scripts. We actually tend to over-anthropomorphize things. Today’s chat bots aren’t intelligent, but they can carry on a kind of limited conversation, and people are often totally willing to interpret their behavior in terms of the intentional stance, as though they had these complex mental processes.

But the notion that the intentional stance is the best way to interpret chat bot behavior falls apart if you press them. You find that the conversation often stops making sense, or there’s a very limited set of things they can do or understand. Chat bots will almost never demonstrate any kind of behavior that evidences strong working memory, extended reasoning or comprehension beyond surface level detail, because they’re simply not doing that stuff. The fact that humans are often easily fooled is a sort of sad commentary on interpersonal relations: a lot of our conversations are very shallow and can be fairly successfully replicated by following scripts.

If you try to ask a chat bot to do something like come up with a theory to explain something complicated, or imagine a fictional scenario you’ve come up with and work out the consequences of the features of the scenario and what it would be like to experience that, it’s going to break down, because that requires a kind of thinking that it doesn’t do.
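To make the pattern-matching point concrete, here’s a minimal sketch of a script-driven chat bot of the kind described above. The patterns, replies and the `reply` function are all made up for illustration and aren’t taken from any real system; the point is only that each response is a function of the current message matched against a short list of scripts, with no memory and no reasoning behind it.

```python
import re

# Hypothetical scripts: (pattern, canned reply). Purely illustrative.
SCRIPTS = [
    (re.compile(r"\b(hello|hi|hey)\b", re.I), "Hello! How are you today?"),
    (re.compile(r"\bi feel (\w+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\b(mother|father|parents?)\b", re.I), "Tell me more about your family."),
]

# Used whenever no script matches; this is where the illusion breaks down.
FALLBACK = "Interesting. Go on."


def reply(message: str) -> str:
    """Return the canned reply for the first script whose pattern matches."""
    for pattern, template in SCRIPTS:
        match = pattern.search(message)
        if match:
            # Echo any captured words back into the template (extra groups are ignored).
            return template.format(*match.groups())
    return FALLBACK


if __name__ == "__main__":
    print(reply("Hi there"))
    print(reply("I feel tired"))
    print(reply("Imagine a planet with two suns. What would shadows look like?"))
```

The first two messages hit a script and get plausible-sounding replies. The third asks the bot to imagine a scenario and work out its consequences, and all it can do is fall through to the generic filler line.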

So that brings us to the next threat, which comes from a kind of hypothetical chat bot taken to the extreme. Imagine you’d provided a chat bot with a script for pretty much every possible conversational topic, and it could always pass for an intelligent being with complex mental processes, but it wasn’t doing any actual information processing that resembles a mental process — it was just doing a kind of complicated table lookup. Surely such a thing doesn’t have mental experiences?

This is closely related to Searle’s famous Chinese Room thought experiment (the pure lookup-table version is usually credited to Ned Block). I’d say such a thing doesn’t have mental experiences when you’re talking to it. Many seem to take this as a kind of reductio argument against A.I.: it’s an example of a machine that behaves like a being with a rich internal experience, but it clearly doesn’t have one, and all machines are doing some sort of analogous symbol manipulation, therefore machines can’t have rich internal experiences.

I think this does work as a reductio of the Turing test as a strict means of testing for intelligence, but I don’t at all think it’s a more general refutation of the possibility of conscious machines. There are a couple angles to attack it from.

One angle of attack is that the conclusion that’s drawn really has nothing to do with the machine nature of the thing, and precludes the possibility of using further information available from heterophenomenology to revise your beliefs.

If I were having a conversation with this amazing Chinese Room chat bot I might, at first and from its responses alone, take the simplest explanation for its behavior to be in terms of complex mental states that I hypothesize using the intentional stance. This is understandable, as most things I encounter that are actually capable of carrying on arbitrarily complex conversations really are intelligent beings. But if I point an X-ray at the damnable thing and determine that it’s actually this weird table-lookup machine, I’ve got new information and I’ll revise my beliefs.

Similarly, if I’m having an internet conversation with someone who I subsequently meet and they turn out to be a vast, genetically engineered beast (I’m picturing a huge vat of flesh connected to a keyboard) that happens to have a biological mechanism that operates in a way that’s analogous to the giant conversation lookup table of the Chinese Room, I’ll dramatically lower my confidence that I was having a conversation with an intelligent being (and dramatically raise my confidence that I’ve been catfished by a demon from hell).

Whether the intelligent-seeming behavior comes from a machine or a biological entity is irrelevant. What seems to matter a great deal is how that behavior came about: was it the result of a process a lot like my own mental processes, or some other gimmicky bullshit?

This brings us to the second angle of attack: you can’t build that machine. It’s impossible. There are an unbounded number of conversational scripts. You couldn’t fit them all in memory even if you encoded them using every electron in the universe. That simple lookup-table-based approach cannot feasibly recreate the full complexity of human behavior, even if it’s in terms of just written conversation.
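A rough back-of-the-envelope calculation shows the scale of the problem. The numbers here are my own illustrative assumptions: a vocabulary of 10,000 words, a conversation capped at just 100 words, and the commonly quoted order-of-magnitude figure of about 10^80 particles in the observable universe standing in for the electron count. The sketch counts every possible word sequence, most of which are nonsense, but even a vanishingly small fraction of the total would still be far too many to store.

```python
# All three constants are illustrative assumptions, not measured values.
ELECTRONS_IN_UNIVERSE = 10 ** 80   # rough order-of-magnitude estimate
VOCABULARY_SIZE = 10_000           # assumed number of distinct words
CONVERSATION_LENGTH = 100          # assumed cap on conversation length, in words


def table_entries(vocabulary_size: int, length: int) -> int:
    """Number of distinct word sequences of exactly `length` words."""
    return vocabulary_size ** length


entries = table_entries(VOCABULARY_SIZE, CONVERSATION_LENGTH)

# How many table entries each electron would need to store.
entries_per_electron = entries // ELECTRONS_IN_UNIVERSE

# Orders of magnitude, read off from the number of decimal digits.
print("table entries: 10^%d" % (len(str(entries)) - 1))                    # 10^400
print("entries per electron: 10^%d" % (len(str(entries_per_electron)) - 1))  # 10^320
```

Under these assumptions a 100-word conversation already needs on the order of 10^400 entries, or about 10^320 per electron, and real conversations have no length cap at all.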

What you’d ultimately have to do is build a more complicated machine. It’d need to be able to take its input and perform some very complex processing on it in order to handle all the possible variation and determine what to say next.

My wager is this: if you’re successful, that processing is going to turn out to be enough like thinking that applying the intentional stance will turn out to be the best way to interpret it. It’ll look a lot like a series of mental states, and the behavior of the system will reflect it having had something very much like the experience of thinking.

I won’t say that a text-based conversation system has the full richness of human experience — there are other feasibility issues regarding how it would come to be able to reason about things involving sensory states for senses it doesn’t have — but the point is just that anything that’s actually capable of consistently behaving like a human has almost certainly got something going on that’s analogous to internal mental experience.

So that’s that. If you meet an A.I. that’s capable of acting like a creature that’s intelligent enough that it’d be morally wrong to destroy it, it is just such a creature, and it is just as morally wrong to destroy it as it would be to destroy an analogous biological creature.

There are a couple of other issues worth talking about, given that we’re already on the subject of A.I.:

First, if you meet such a thing, it will almost certainly not be as a result of a human specifically programming it to be capable of all those behaviors. Sci-fi is full of really dumb examples of intelligent machines that have capabilities strictly limited by what their creator programmed into them (e.g. “I’m not programmed for love!”).

That’s not really how A.I. works. Intelligence is too complicated to build piece by piece; it’s gotta be grown. The higher mammals have fewer and fewer “programmed-in” behaviors, and more and more that they learn from experience and socialization. Humans have almost no specific behavior that’s genetically determined; we’re almost entirely the product of the fact that we’re really good at learning stuff. The current thinking in the cognitive sciences is that very human-like things like emotions are actually high level cognitive heuristics that we learn in order to help us reason about things like social situations. We’re predisposed to be good at learning certain kinds of things, but a lot of the quintessentially human behaviors simply don’t develop without the right experience and socialization. Not all humans develop things like empathy, and when they don’t it’s usually because of traumatic experiences, often involving parents.

There are good reasons to believe that A.I. would generally have to work the same way. We’d make something with some raw motives and the capacity to navigate and learn within some kind of environment, and it’d grow up into an intelligent being with the capacities that it was able to acquire to successfully navigate its world. If it didn’t learn to experience the world in human terms at all, and didn’t develop something like empathy, it’d be because we didn’t provide it with the kind of experiences that make such things possible or necessary. In other words, it’d be because we were really bad parents.

Second, you may not ever meet such a thing. Despite a lot of recent advances, we’re really a long way off. I’d have to write a great deal more to really make this argument persuasive, so you’ll just have to bear with me here while I make some relatively controversial claims.

There are some superficial similarities between things like artificial deep-learning neural networks and the biological neural networks that make up human brains, but the difference between them is one of kind, not of scale. The kind of learning ANNs do (even the newfangled deep learning ANNs) is really radically different from the kind we do, even though the results sometimes sort of seem the same. It’s generally supervised and involves a great deal more repetition and fine tuning than ours does. We’re intelligent agents who will do a great deal of reasoning, hypothesizing and abduction from a very limited number of examples. Modern A.I. is just not capable of that, and there’s no path from where we are now to real A.I. that involves just scaling up what we’re currently doing. There’s gotta be some really fundamental paradigm shifts around unsupervised learning and agentive behavior first. When people talk about current state-of-the-art A.I. they’re almost always talking about machine learning, which is a baby step beyond just fancy statistics. Given the state of the world, if I had to bet on whether we develop true A.I. or drive ourselves extinct first, I’d bet it all on the latter. I got into A.I. despite this because, well, a man can hope.