AI machines as moral agents: Minimal agency (part 7)

H R Berg Bretz
13 min read · Dec 12, 2021


Part 6 was about defining agency and a common view which I called Intentionality agency, a “thick” definition. Here it is contrasted with minimal agency, a “thin” definition.
For a mission statement, see Part 1 — for an index, see the Overview.

Does an insect have agency? Photo by Elegance Nairobi on Unsplash

3.2. Minimal agency

Xabier Barandiaran, Ezequiel Di Paolo and Marieke Rohde's aim in "Defining Agency" is much more ambitious than what is needed for this text, but they are looking for the same thing we are interested in: a definition of agency that does not rely on obscure properties[1]. They say that "most current researchers assume an intuitive and unproblematic notion of agency" (2009, p. 1). Torrence (2014), David Davenport (2014) and Etzioni (2017) discuss 'artificial agents', but none of them defines what they mean by agency, which supports Barandiaran et al's observation. Floridi and Sanders explicitly say that they want to avoid defining agency, because 'agent' is a vague term and it is hard to encase it with "the usual planks of necessary and sufficient conditions", i.e. a definition. Agenthood is one of those terms, along with life, intelligence and mind, that cannot be defined "because they all admit of subtle degrees and continuous changes" (2014, p. 352). Instead, Floridi and Sanders introduce the concept of a level of abstraction from the field of mathematics (2014, p. 353). This shows that Barandiaran et al's minimal agency could indeed be helpful here, spelling out a type of agency that seems to avoid the problems associated with intentionality agency. Let us next look more closely at their definition of agency, at other minimalistic accounts, and at how well they fit together.

3.2.1. Barandiaran et al’s minimal agency

Barandiaran et al propose three conditions for agency: 1) individuality, 2) asymmetry in action, and 3) normativity in action (2009, p. 1).

By individuality they mean that the agent must be distinguishable from its environment. This is often taken for granted or seen as trivially irrelevant, but it can be vital when considering gases or components that are functionally lumped together (2009, p. 3). I will not go further into this, since for my purposes it is, as they say, trivially irrelevant. That is, I assume the general claim that it should be reasonably clear where the agent ends and the environment begins for the types of artifacts discussed here[2].

Asymmetry in action ("interactional asymmetry") is the condition of doing something, of being active rather than passive: "breaking the symmetry of its coupling with the environment so as to modulate it from within" (2009, p. 3). The agent manages and gathers the energy resources for action, like a bird gliding on the wind: the wind is the resource, and the bird asymmetrically counteracts it (breaking the symmetry of its coupling), using the wind to get where it wants to go. The temporal aspect is often significant, since the agent typically 'acts first', as illustrated by the difference between a person falling off a cliff and a person diving from one. These aspects are not unproblematic, however, since the environment can also exert energy on the agent, and can do so before the agent acts. Barandiaran et al therefore suggest a statistical approach to determining the asymmetry and summarize it as "An agent is a system that systematically and repeatedly modulates its structural coupling with the environment" (2009, p. 3–4). In other words, the agent statistically and systematically controls the environment more than the environment controls it.

Normativity is the condition that is meant to rule out arbitrary and random interaction. It is what transforms modulation into regulation in order to satisfy a given norm. A norm could be, for instance, a goal the agent is trying to achieve. The norm must be followed; anything else counts as a failure. Planets, for example, are not agents because they cannot "fail" to follow the laws of nature. These norms must not be external but grounded in the very "nature" of the agent; the exception is social norms, which are by definition external but are internalized by the agent (2009, p. 5).

Each of these conditions is necessary for agency on its own, and together the three are sufficient. They are related to each other, but not on equal terms: individuality seems to be a precondition for normativity and asymmetry, since without it there would be nothing to apply the other two conditions to (2009, p. 6).

Let us now consider some other ideas that relate to this definition of agency, ideas that support and expand on it.

3.2.2. Legg and Hutter on machine intelligence

When Shane Legg and Marcus Hutter set out to define machine intelligence, their formal definition of an agent is "a function, denoted by π, which takes the current history as input and chooses the next action as output", and the agent uses a reward system to measure the success of its actions (2007, p. 17–19).

(Illustration from Legg and Hutter, 2007, p. 16)

The agent interacts with the environment, sending information to it and receiving information from it. They suggest that the best way to assign a goal to the agent is to say that it should maximize its reward. It is also possible to give the agent a specific goal, but that would limit it to just that one goal. Another way would be to let the agent know its goal through language, but that is generally too strong an assumption to make, since even agents as complex as dogs cannot really be said to understand language. By saying that the agent should maximize its reward, you can change its behavior by changing the reward structure; the goal is thus both fixed and flexible at the same time (2007, p. 15–16).
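To make the idea of the agent as a function π a bit more concrete, here is a minimal sketch in Python. It is my own toy illustration, not Legg and Hutter's formal construction: the two-action environment and the reward rule are invented for the example. The point is only that the agent is nothing more than a function from the interaction history to the next action, and that changing the reward structure changes its behavior.

```python
import random

def environment(action):
    """Toy environment: action 1 is rewarded more often than action 0."""
    reward = 1.0 if (action == 1 and random.random() < 0.8) else 0.0
    observation = random.random()  # some arbitrary observation signal
    return observation, reward

def agent_pi(history):
    """Toy policy: pick the action with the best average reward so far,
    exploring at random now and then."""
    totals, counts = {0: 0.0, 1: 0.0}, {0: 1, 1: 1}
    for action, _obs, reward in history:
        totals[action] += reward
        counts[action] += 1
    if random.random() < 0.1:
        return random.choice([0, 1])
    return max((0, 1), key=lambda a: totals[a] / counts[a])

history = []  # the interaction history: (action, observation, reward) triples
for _ in range(200):
    action = agent_pi(history)          # pi: history -> next action
    observation, reward = environment(action)
    history.append((action, observation, reward))

print("total reward:", sum(r for _, _, r in history))
```

If the environment is changed so that action 0 pays off instead, the same agent function gradually shifts its behavior, which is the sense in which the "maximize reward" goal is both fixed and flexible.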

It seems that Legg and Hutter have assumed the individuality condition, just as Barandiaran et al claimed is often done when discussing agency. The same goes for the asymmetry condition, since otherwise I do not see how they could discern the agent from the environment. If you say that a system can act (produce output), there must be an asymmetric relation to what it acts upon; otherwise, how could you say that the system is the source of the act?

The normativity condition and the goal-achieving mechanism seem very compatible, and this is where the heart of their paper lies: it deals with how successful the agent is at achieving its goals, that is, how "intelligent" it is. Legg and Hutter's definition of intelligence is that "intelligence measures an agent's ability to achieve goals in a wide range of environments" (2007, p. 12). My analysis is that intelligence is not something that, according to Barandiaran et al's criteria, defines agency, because nothing says that the agent needs to be good at achieving its goals, i.e. to be 'intelligent'. However, if the system in no way achieves its goals, it is not an agent, since then you would have to claim either that it has no goals or that it has goals but does not try to achieve them, and either alternative would violate the normativity condition, that the agent tries to achieve its goals. This brings us to the question of how to attribute goals to an agent, and whether the agent actually has those goals, something that Dennett's belief/desire model has an answer to: desires are the goals the agent tries to satisfy.

3.2.3. Daniel Dennett and the belief/desire model

Daniel Dennett's instrumentalist view, the intentional stance, says that it is appropriate to attribute beliefs and desires when doing so accurately predicts the agent's behavior (1987: Ch. 2). The description of an agent as having beliefs about the world and using those beliefs to choose how to act in order to satisfy its desires is similar to what Barandiaran et al refer to as normativity: to distinguish an agent from a non-agent, the agent must be said to try to achieve its goals or follow some norm. The belief/desire model is the most widely accepted model for understanding intentional action. Beliefs are the ability to represent the world, and desires are the motivation, or instructions, for what to achieve in the world. For example, if an entity believes that there is water in a cup and desires to drink water, it is rational for the entity to reach for the cup and drink its contents.
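To make the model slightly more concrete, here is a minimal sketch, entirely my own toy illustration rather than anything from Dennett: beliefs are a small representation of the world, desires are goal conditions, and the rational choice is an action whose predicted outcome satisfies a desire.

```python
# Toy sketch of the belief/desire model (my own illustration, not Dennett's):
# beliefs represent the world, desires say what to achieve, and a rational
# agent picks an action whose predicted outcome satisfies one of its desires.

beliefs = {"cup_contains_water": True, "cup_within_reach": True}
desires = ["drink_water"]

def predicted_outcome(action, beliefs):
    """What the agent believes each action would accomplish."""
    if action == "reach_for_cup_and_drink":
        if beliefs["cup_contains_water"] and beliefs["cup_within_reach"]:
            return "drink_water"
    return None

def choose_action(beliefs, desires, available_actions):
    """Pick the first action predicted to satisfy a desire."""
    for action in available_actions:
        if predicted_outcome(action, beliefs) in desires:
            return action
    return "do_nothing"

print(choose_action(beliefs, desires, ["reach_for_cup_and_drink", "do_nothing"]))
# prints: reach_for_cup_and_drink
```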

An interesting contrast to something best described by the intentional stance is an artifact like a thermostat, which is usually better described by what Dennett calls 'the design stance' (2009, p. 3)[3]. That is, the design stance explains it better (e.g. has better predictive power) than the intentional stance does. It is possible to say that a thermostat believes the temperature is too low and desires to increase it, but that has only a very narrow range of explanatory power. If you are explaining it to a child, the belief/desire model of the thermostat could be useful, but to an engineer it is not. By that I mean that beliefs and desires for a thermostat can be interpreted in many ways, and many of those interpretations do not correctly describe the thermostat's behavior, while a simple mechanistic or physical design-stance description can be very accurate and does not invite as many false interpretations. The point is that just because you can describe something as having beliefs and desires does not mean it is appropriate to do so. The belief/desire model gives a description of the intentional action of agency that does not imply that the agent has, or needs to have, some level of consciousness, which supports my argument for minimal agency.
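As a rough illustration of the contrast, here is a design-stance sketch of a thermostat along the lines of footnote [3]. The setpoint and the details are my own assumptions, and the intentional-stance reading is left as a comment to show how little extra predictive work it does.

```python
# Design-stance description of a thermostat: a simple threshold rule.
# A toy sketch; the setpoint is assumed, and real thermostats typically add
# hysteresis or PID control rather than a bare threshold.

TARGET_TEMP = 20.0  # assumed setpoint, degrees Celsius

def thermostat(current_temp, target=TARGET_TEMP):
    # The intentional-stance gloss, "it believes the room is too cold and
    # desires to warm it up", adds nothing to the prediction made here.
    return "heating_on" if current_temp < target else "heating_off"

for temp in [17.5, 19.9, 20.0, 22.3]:
    print(temp, "->", thermostat(temp))
```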

The instrumentalist stance makes no claim of realism, that beliefs or desires are real, only that when an explanation in terms of beliefs and desires is instrumental to describing the entity, then it has beliefs and desires. This makes it possible to say that an artifact can have the mental states of belief and desire. A critique of this view is that the instrumental stance might make it easier to attribute mental states in cases where it is hard to know whether the agent has them or not, but it does not tell us whether the agent actually instantiates these states. This is true, and it therefore matters what kind of objects these mental states are: whether they are 'real' objects that are in fact instantiated in an entity, or nothing more than something that appropriately describes the entity's behavior. I will not try to settle this issue here, but if you want to argue that an artifact can have beliefs and desires, then it should be clear what is meant by claiming that it instantiates these states, and Dennett offers an account of that. Also, if you were to side with realism here, that could invite the problem of other minds: if you say that an entity has to actually have these states, how can you claim that other humans have them, and if you cannot, how can you say that the states are 'real' or what they consist of?

Another way to approach defining agency is to look at how the word is commonly used in everyday language. Dictionary.com lists 14 definitions, and seven of them are: a person or thing that acts or has the power to act; a natural force or object producing or used for obtaining specific results; an active cause, an efficient cause; a person responsible for a particular action; a substance that causes a reaction; any microorganism capable of causing disease; a drug or chemical capable of eliciting a biological response[4]. Many of these suggest a wider definition than intentionality agency, implying that a wider definition is intuitive in our language. Now, this is a weak line of argument, since there are many other possible explanations. For example, some of these uses are more or less contradictory and have different scopes, so it is not really clear what the 'everyday use' consists of. But it does show that a wider scope for agency is at least not unintuitive. To give this some strength, I will next present research from psychology, backed by empirical evidence, which suggests that a wider notion of agency, similar to minimal agency, is how we intuitively perceive agency.

3.2.4. Fiala et al and the psychological notion of agency

Brian Fiala, Adam Arico, and Shaun Nichols draw on recent psychological and cognitive experiments that support the notion of minimal agency. They claim that humans are inclined to apprehend a basic form of agency very quickly, provided that some basic conditions of the entity are met. This claim (the agency model) is based on Fritz Heider and Marianne Simmel's (1944) landmark study, in which participants were shown an animation of geometric shapes (triangles, squares, circles) moving in non-inertial trajectories[5]. The study found that participants used words like "chasing", "wanting" or "trying" to describe these animations, implying that merely from the movement in the animation, the participants attributed agency to the shapes. More recent studies on infants have found similar results (2012, p. 5)[6]. This 'fast-agency' attribution is not only supported by empirical evidence; drawing on another body of research[7], Fiala et al also propose that the dynamics between a fast and a slow system explain many of our everyday attributions of consciousness (which are sometimes biased), and that those attributions are the result of the fast system. The 'fast-agency' attribution is part of the fast system (System 1) in a dual-process theory:

“A crude version of dual-process theory holds that mental systems fall into two classes. In one class, System 1, we find processes that are quick, automatic, unconscious, associative, heuristic-based, computationally simple, evolutionarily old, domain-specific and non-inferential. In the other class, System 2, we find processes that are relatively slow, controlled, introspectively accessible, rule-based, analytic, computationally demanding, inferential, domain-general, and voluntary.” (2012, p. 3)

To illustrate the fast vs the slow system, they consider this argument:

“All unemployed people are poor.
Rockefeller is not unemployed.
Conclusion: Rockefeller is not poor.” (2012, p. 4)

The fast system incorrectly leads many people to judge this argument as valid, but with some basic reasoning, the slow system will convince most that it is not a valid argument. Now, they continue, not all mental processes may fall tidily into this paradigmatic division of slow and fast, but they believe the (fast) agency model does (2012, p. 4). To test their model, they presented subjects with a sequence of object/attribute pairs (e.g. ant/feels pain), and the subjects were asked to respond as quickly as possible whether the object had the attribute or not (Yes/No). The result was that "Participants responded significantly more slowly when they denied conscious states to objects that do have the superficial AGENCY cues, namely, insects" (2012, p. 7). This supports their theory that the agency model is correct. It also suggests that attributions of consciousness are made by the slow, deliberative system[8].

I find that minimal agency corresponds well with the agency model applied here by the fast system, since both avoid attributions of consciousness. This means that there is some evidence that minimal agency is an intuitive concept, presumably the result of evolution. Of course, this does not necessarily mean that the slow system is incorrect or that the fast system should be trusted more, only that minimal agency is a very natural concept for humans. My claim is that the simplest explanation of our everyday use of 'agent' corresponds with our evolved, intuitive concept of agency, which in turn corresponds with what Barandiaran et al are trying to explore and define. Also, the slow system could be biased to overrule the fast system when the fast system finds agency in entities that are not usually associated with consciousness.

With this in mind, we can now distinguish three different levels of agency: the knife, which is an artifact but has no agency; the autonomous robot or an animal, which can have minimal agency; and a human (or other advanced being), which in addition has intentionality agency. If the definition of minimal agency is what separates the non-agent from the agent, then that difference is (somewhat) clear, but what differentiates the minimal agent from the intentionality agent is less clear, making that an important task for future work in philosophy of mind and cognitive science. Next, I want to touch upon Floridi and Sanders' reluctance to define agency, because this very reluctance can be a reason why minimal agency should be preferred over intentionality agency.

Comments:

Rereading this, I'm not satisfied with my explanation of the belief/desire model: it's too shallow, and it doesn't help the argument that much. But generally, I still feel that the concept of minimal agency is salient, and the examples here are interesting. The basic argument is that AI machines probably don't have mental states, but they still have "more" agency than dead objects. We need to define that and figure out how it should be included in our moral reasoning.

Next part: Part 8

Footnotes:

[1] They are interested in a much more fine-grained definition, one that could determine e.g. whether the tremors affecting a patient with Parkinson's disease count as the actions of an agent, or whether a protocellular system pumping ions out through its membrane counts as an agent.

[2] This is not to say that it does not concern artificial agents. One interesting question is what makes a robot an individual agent. If a human uses an artifact, it is often unproblematic to distinguish between the artifact and the agent (e.g. the artifact is often non-biological), but when is a 'robotic arm' part of the robot, and when is it an artifact used by the robot? This might also have substantial effects on defining agency, because if the individuality condition cannot pick out an agent, there are no agents at all. This is a relevant, but too complicated, issue to delve into here.

[3] The rules could be something like: if the input temperature is below X, turn on the heating; if it is above X, turn off the heating. Most thermostats, though, probably use some kind of mathematical PID regulation (Proportional–Integral–Derivative).

[4] Many of the others are specific definitions of similar phenomena, like a representative of a business, government or election. https://www.dictionary.com/browse/agent (2019–05–16)

[5] Once you see the animation this becomes much clearer, and I think most people find it very intuitive to ascribe agency to these shapes, with nothing more to go on than the shapes' movements (no sound). One example can be found on YouTube: https://www.youtube.com/watch?v=VTNmLt7QX8E (2019–05–16)

[6] This is how they describe the research: “Johnson, et al. found that when the fuzzy brown object included eyes, infants displayed significantly more gaze-following behavior than when the fuzzy brown object did not include eyes. They also found that infants displayed the same gaze-following behavior when the fuzzy brown object, controlled remotely, moved around and made noise in apparent response to the infant’s behavior. Johnson and colleagues explain these effects by suggesting that when an entity has eyes or exhibits contingent interaction, infants (and adults) will categorize the entity as an agent. Once an entity is categorized as an agent, this generates the disposition to attribute mental states to the entity, which manifests in a variety of ways, including gaze following, imitation, and anticipating goal-directed behavior.” (2012, p. 5)

[7] Specifically: “moral judgment (Haidt, 2001), decision-making (Stanovich and West, 2000; Stanovich 2004), probabilistic reasoning (Sloman, 1996), and social cognition (Chaiken and Trope, 1999)”.

[8] There is a caveat here: the example of "ant/feels pain" misses my target somewhat, since "pain" is generally associated with being a moral patient, not a mere agent. However, pain is also associated with consciousness, and consciousness is the part of intentionality agency that is problematic according to my argument.

