AI machines as moral agents: Defining moral agency (part 10)

H R Berg Bretz
8 min read · Feb 7, 2022


In Part 9 I discussed how it would be possible for a programmed agent to be free enough to make decisions of its own rather than its creator’s. Now it is time to summarize the previous claims into a definition of moral agency.

For a mission statement, see Part 1 — for an index, see the Overview.

Photo by Possessed Photography on Unsplash

4.2. Defining moral agency

Floridi and Sanders’ approach to defining moral agency is a bit different from mine, because they define moral agency without defining agency at all. This makes it less obvious how to compare their three criteria for moral agency with the moral agency I am proposing. Despite this problem, I will conclude this chapter by arguing that the essential difference is what I call the ‘learning criterion’, which is what turns a minimal agent into a morally accountable agent.

Floridi and Sanders begin by explaining that the systems of interest when it comes to artifacts are systems that can change. The concept of a transition system is suitable for describing them. A transition system has input, output and internal states. It also has transition rules that take input and yield output (an external transition), change its current state in response to input (an internal transition), or do a combination of the two (2004, p. 356).
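
To make this concrete, here is a minimal Python sketch of such a transition system. It is my own illustration, not Floridi and Sanders’ formalism, and the names (TransitionSystem, step, the toy rules) are chosen only for this example.

```python
# A minimal sketch of a transition system as described above.
# Class and method names are illustrative, not Floridi and Sanders' terms.

class TransitionSystem:
    def __init__(self, initial_state, rules):
        # 'rules' maps (state, input) pairs to (next_state, output) pairs.
        self.state = initial_state
        self.rules = rules

    def step(self, input_value):
        """Apply a transition rule to the current input.

        An external transition yields output, an internal transition
        changes the current state; a single rule may do both.
        """
        next_state, output = self.rules[(self.state, input_value)]
        self.state = next_state      # internal transition
        return output                # external transition


# Example: a two-state system that reports whether it has seen input before.
rules = {
    ("fresh", "ping"): ("seen", "first ping"),
    ("seen", "ping"): ("seen", "ping again"),
}
system = TransitionSystem("fresh", rules)
print(system.step("ping"))  # "first ping"
print(system.step("ping"))  # "ping again"
```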

Since Floridi and Sanders’ aim is to show that an artificial agent can be a moral agent, they first consider what makes a ‘standard’ agent (the human Jan) an agent. Jan is an agent if he is “a system, situated within and a part of an environment, which initiates a transformation, produces an effect or exerts power on it”[1]. However, they find that this does not separate Jan from an earthquake, and since an earthquake does not constitute a moral agent, they propose the criteria of interactivity, autonomy and adaptability (“the three criteria”) to address this issue:

· Interactivity means that the agent and its environment can act upon each other.

· Autonomy means that the agent is able to change state without direct response to interaction.

· Adaptability means that the agent’s interactions (can) change the transition rules by which it changes state. This might also be viewed as some kind of learning from experience.

For example, a rock exhibits none of these criteria, a pendulum exhibits autonomy only and a thermostat is both interactive and adaptive but not autonomous[2] (2004, p. 357–9).
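
As a rough illustration (again my own, not Floridi and Sanders’), the three criteria can be treated as simple yes/no properties and the examples above checked against them. The data structure and names below are chosen only for this sketch, using the classifications given in the paragraph above.

```python
# A sketch of the three criteria as yes/no properties, using the
# classifications given in the text above. Purely illustrative.

CRITERIA = ("interactivity", "autonomy", "adaptability")

EXAMPLES = {
    # entity: the set of criteria it exhibits (per the paragraph above)
    "rock": set(),
    "pendulum": {"autonomy"},
    "thermostat": {"interactivity", "adaptability"},
}

def meets_all_three(entity):
    """Floridi and Sanders' candidate for moral agency at a given LoA."""
    return set(CRITERIA) <= EXAMPLES[entity]

for entity in EXAMPLES:
    missing = set(CRITERIA) - EXAMPLES[entity]
    print(f"{entity}: candidate moral agent = {meets_all_three(entity)}, "
          f"missing = {sorted(missing) or 'none'}")
```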

As the attentive reader may have noticed, we now have overlapping criteria. This is because of Floridi and Sanders’ reluctance to define agency. Since they want to define moral agency for an artifact, an intuitive approach would be to first define agency and then define what is required for an agent to be a moral agent. Although it might be hard to define agency, I still believe this is the best way, and since Barandiaran et al. have shown some progress in that field, let us continue on that path. Let us then compare Floridi and Sanders’ criteria for moral agency with the criteria for minimal agency (Individuality, Asymmetry, Normativity), despite this overlapping problem.

Yet again, individuality seems to be assumed by Floridi and Sanders, as it is by others. I also note two distinct differences between these two sets of criteria. The first is that Floridi and Sanders do not seem to rule out arbitrary or random behavior. None of the three criteria address any kind of rationality or predictability, only a certain level of complexity. This violates the normativity condition, which explicitly rules out random and arbitrary behavior. From Floridi and Sanders’ standpoint of moral agenthood this is not as problematic as it might seem, since for a morally accountable agent it does not matter why the agent acts or how it decided on this action, as long as it is a moral act. But for my argument it is problematic, because if the artifact does not meet the normativity condition, it is not even an agent and by implication cannot be a moral agent either. Floridi and Sanders instead bypass the definition of agency by saying directly that if the artifact meets these criteria at a specific level of abstraction (LoA), then at that LoA it is a moral agent.

The second difference is that Floridi and Sanders also describe something Barandiaran et al. do not, namely the agent’s ability to ‘change its transition rules’: adaptability. The minimal agent can handle different inputs from the environment, but Barandiaran et al. do not require that the agent be able to update its transition rules so that it reacts differently the next time it receives the same input. This is what Legg and Hutter were referring to when defining machine intelligence, namely how ‘intelligent’ the agent is: a higher level of intelligence requires that the agent is able to update its beliefs about the world, and the better it does this, the better it can achieve its goals. If the agent cannot do this, it behaves as in the saying “Insanity is doing the same thing over and over and expecting a different result”[3].

Now, Floridi and Sanders say that the adaptability criterion “ensures that an agent might be viewed, at the given LoA, as learning its own mode of operation in a way which depends critically on its experience” (2004, p. 358). They spell this out as:

(i) It responds to new situations, on the basis of current information,

(ii) It could have or did change its behavior during the action and

(iii) It was not just following predetermined instructions but changed its current heuristics for action. (2004, p. 364)[4]

If this is correct, then what separates the minimal agent criteria from the moral agent criteria is that the latter adds the learning ability ‘adaptability’, the agent’s ability to ‘update its current heuristics for action’, but does not exclude arbitrary or random behavior. If this ‘learning criterion’ is appended to the minimal agent definition, the resulting definition would pass Floridi and Sanders’ three criteria for moral agency, except that it would automatically rule out entities with random or arbitrary behavior, since according to the minimal agency definition such an entity is not an agent at all. Intuitively, this makes sense. If an artificial agent is controlling a car at a pedestrian crossing, and that agent meets the learning criterion, evaluating whether it is a morally accountable agent is similar to how we would evaluate a human in the same position. Whether either type of agent will brake in time depends on whether it has the concrete freedom to perform the act of braking.

A problem that Floridi and Sanders face concerning the difference between the two definitions of a moral agent is this: it is not as clear what should be considered an agent according to their levels of abstraction (LoA). As noted several times, they have a reason for this: they consider defining agency too hard. The advantage of their approach is that they are not bogged down trying to define agency but can identify a moral agent from a specific LoA. However, this runs the risk of just postponing the problem to a later stage in the analysis. At a specific LoA, it seems that the individuality condition, for example, needs to be addressed somehow; otherwise too much room is left for different interpretations of what actually is part of the system and what is not, and questions about what is referred to by ‘agent’ will reappear at that later stage. And I do not think these questions are trivial. If you, for example, have the pragmatic aim of mitigating the outcomes of moral actions by artifacts, a vital part of this is a definition of what counts as the source of the action. If you do not know the boundaries of the entity that is the source of the act, what are you referring to? Is it one thing or many things? This also seems to invalidate much talk of agency that does seem important in moral discussion, and I am not convinced that replacing discussions about agency with LoAs is really as clear and informative as Floridi and Sanders believe.

Notwithstanding this, the learning criterion seems to be aptly described by the difference between a directly and an indirectly programmed artifact. The directly programmed artifact cannot update its transition rules, and therefore lacks the potential to behave differently when confronted with the same input while in the same state, whereas the indirectly programmed artifact can. Henceforth, the ‘learning criterion’[5] is the ability of an agent to relevantly update its transition rules, where ‘relevantly’ is meant to reflect Floridi and Sanders’ conditions (i), (ii) and (iii) above, which indicate how the agent adapts in a way that is relevant for that specific situation.
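
To put the contrast in programmer’s terms, here is a minimal sketch of a directly programmed artifact with fixed transition rules next to an indirectly programmed one that can rewrite them. The class names and the toy driving example are my own illustration, not part of Floridi and Sanders’ account.

```python
# A sketch of the contrast between directly and indirectly programmed
# artifacts. 'FixedSystem', 'AdaptiveSystem' and 'update_rule' are
# illustrative names only.

class FixedSystem:
    """Directly programmed: the transition rules never change."""
    def __init__(self, rules):
        self.rules = dict(rules)

    def step(self, state, input_value):
        return self.rules[(state, input_value)]


class AdaptiveSystem(FixedSystem):
    """Indirectly programmed: experience can rewrite the transition rules,
    so the same (state, input) pair need not yield the same behavior twice."""
    def update_rule(self, state, input_value, new_outcome):
        # The 'learning criterion': relevantly updating the rules themselves,
        # not merely following predetermined ones (cf. conditions (i)-(iii)).
        self.rules[(state, input_value)] = new_outcome


fixed = FixedSystem({("crossing", "pedestrian"): "brake late"})
adaptive = AdaptiveSystem({("crossing", "pedestrian"): "brake late"})
adaptive.update_rule("crossing", "pedestrian", "brake early")

print(fixed.step("crossing", "pedestrian"))     # always "brake late"
print(adaptive.step("crossing", "pedestrian"))  # now "brake early"
```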

I suggest the following definition of moral agency:

An entity is an agent if and only if it meets the three conditions Individuality, Asymmetry and Normativity.

An agent is the morally accountable agent for a moral outcome if and only if it
a) is the agent that directly caused that outcome, and
b) meets the learning criterion.
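
For the programmers among us, the logical structure of this definition can be sketched as two boolean checks. The field names below are placeholders for the substantive philosophical conditions; nothing in the sketch decides whether they actually hold for a given entity.

```python
# A sketch of the logical structure of the proposed definition.
# The boolean fields stand in for the substantive conditions.

from dataclasses import dataclass

@dataclass
class Verdicts:
    individuality: bool
    asymmetry: bool
    normativity: bool
    directly_caused_outcome: bool
    meets_learning_criterion: bool

def is_agent(v: Verdicts) -> bool:
    """Minimal agency: all three conditions must hold."""
    return v.individuality and v.asymmetry and v.normativity

def is_morally_accountable(v: Verdicts) -> bool:
    # Accountability presupposes agency, so a randomly behaving entity
    # (which fails normativity) is excluded before the learning criterion
    # is even considered.
    return is_agent(v) and v.directly_caused_outcome and v.meets_learning_criterion


v = Verdicts(individuality=True, asymmetry=True, normativity=False,
             directly_caused_outcome=True, meets_learning_criterion=True)
print(is_agent(v), is_morally_accountable(v))  # False False
```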

This definition means that the moral agent does not behave in a random or arbitrary manner, because if it did, it would not even be an agent. It also states upfront what is meant by calling an artifact an agent and does not postpone important distinctions to a later stage. Apart from these differences, this definition is equivalent to Floridi and Sanders’ definition of an accountable moral agent and should be just as valid.

Still, there are some who claim that consciousness is implied by agency, or that artificial agency is not possible because the agent cannot be free enough, and therefore I will address two such advocates in the next chapter. This will also give me an opportunity to expand on what is meant by being free and to address a counterargument to the claim that indirect programming enables concrete freedom.

Comments:

I realize now that I haven’t really tested this definition against any examples or counterexamples to see if it is useful. I guess that is partly because the scope of this paper is too big. Maybe this definition should be seen more as a first draft. My guess is that these definitions will become quite important in the future, when we have many “intelligent” machines out in society, in varying degrees of complexity and moral relevance.
Next part is here!

Footnotes:

[1] Jan is an agent at this specific “level of abstraction”; it is not a definition.

[2] As viewed at the level of abstraction from a video camera over a period of 30s. (2004, p. 359).

[3] Implying that insanity is a lack of intelligence.

[4] Condition (i), along with the quote that “they could have chosen differently because they are interactive, informed, autonomous and adaptive” (p. 366), suggests that being “informed” is a fourth criterion. Since they only mention three criteria, Floridi and Sanders most likely would not concur. Instead, being informed might be captured by their interactivity criterion, since ‘to respond to input’ presupposes input from the environment. But this is somewhat contradictory, since interactivity simply says that a moral agent can interact with its environment and does not say anything about whether it has interacted with the environment before, i.e. has received information. One could argue that as soon as it has received one input it has been informed, but I think it is more reasonable to say that it also needs relevant information. How much turns on this is hard to say, but I do not find it relevant for the argument I make here. The exact quote is: “Interactivity means that the agent and its environment (can) act upon each other. Typical examples include input or output of a value, or simultaneous engagement of an action by both agent and patient — for example gravitational force between bodies.” (2004, p. 357)

[5] It is essentially Floridi and Sanders’ adaptability criterion, though used differently here.

--

H R Berg Bretz

Philosophy student writing a master’s thesis on criteria for AI moral agency. Software engineer of twenty years.