Project: Legal accountability in AI-based robot-agents’ user interfaces.

Roland Pihlakas, 2. November 2018


Submitted to the Centre for Effective Altruism as a grant application.

A publicly editable Google Doc with this text is available here, in case you want to see the updates easily (using the version history), ask questions, comment, or add suggestions.


Determining exactly why a robot-agent made a particular decision is often difficult. But the whitelisting-based accountability algorithm can greatly help by readily answering the question of who enabled that decision in the first place. Whitelisting enables accountability and humanly manageable safety features in robot-agents with learning capability.

The central subject of this project is legal accountability in artificial intelligence. We are going to show who can justifiably and fairly be made responsible for the actions of artificial agents, and how whitelisting can help both artificial intelligence developers and legislators make sure that we will have as few surprises as possible. In other words, the project will research the possible ways to control and limit the agents’ actions and learning from a legal point of view by utilising specialised, humanly comprehensible user interfaces, resulting in clearer distinctions of accountability between the manufacturers, owners, and operators.

The standards proposed in the project are applicable to agents at various capability levels. The agents operate literally as representatives of persons. The project entails designing human-manageable and legally binding structures, consisting of: 
1. Agent capability-based whitelist of permissions / clearances with user accountability and authorisation levels; 
2. Explicit goals. 
The agent’s subgoals, actions, and learning are restricted by the whitelist. The project includes technical directions for achieving the above-mentioned standards.

The project is a continuation of the research I started during my PhD years. A similar topic and approach is now being discussed by Estonian lawmakers (and related laws are also under discussion in various other countries), but apparently without knowledge of the technical details. Some of the solutions they currently consider are suboptimal and contain certain confusions (which are also explained in this project text), mostly because the considered legal solutions would give agents too much freedom and demand too little accountability from the persons the agents represent. Their project is at risk of failure unless they get help with technical planning. Helping to further the Estonian robot legislation project will also provide a useful example internationally for other countries working towards their own related legislation.

The goal is to share the results of this project with the lawmakers on an ongoing basis. It is also necessary to improve the public discussion of these topics in the format of seminars (and not just press conferences), thereby engaging more lawyers, computer scientists, economists, and other interested members of society and enabling them to learn and chime in. This would add enough pressure on the lawmakers to cooperate.

The project is concerned only with the topic of accountability and responsibility, and not with other legal aspects of regulating robot-agents (for example, property laws, copyright laws, public morality, data protection, etc.).

Below follows a description of the concepts and topics that will be covered by the research and on which the corresponding standards and prototype will be based.

A brief overview of Estonian lawmakers’ “Kratt” report in relation to accountability.

Below follows a short list of the lawmakers’ main accountability-related motivations and concepts with which I agree.

  • The robot-agent is treated as a representative of persons.
  • The robot-agent itself should not be held responsible.
  • Use of the concept of declaration of will.
  • The concept of “pocket money” (essentially a quantified declaration of will).
  • The need for a central robot registry.

Some of the key concepts in the current project:

  • The robot-agent / representative:
    An autonomous device operating in a physical or software realm in the interests of its user or owner. The robot-agents are interpreted as being representatives of people or legal persons and as having limited legal capacity.
  • Permissions / clearances:
    - The clearances and permissions represent the diligence responsibilities of the manufacturer, owner, or user of the robot-agent.
    - Clearances are given based on the capabilities and competences of the robot-agent.
    - Permissions are given based on the preferences of the user. Enabling any permissions assumes that the user understands and accepts the corresponding responsibilities.
  • Layered clearances / permissions:
    All clearances and permissions that can be given are available only when these clearances and permissions were enabled by the upper layer clearances or permissions. For example, the owner of the robot cannot enable or turn on permissions that the manufacturer did not enable.
  • Layers of responsibilities:
    Enable analysis of who enabled the clearances or permissions that the next layer of users was then able to turn on.
  • Whitelisting:
    A way of representing the permissions and clearances in such a manner that anything that is not permitted is automatically and implicitly considered forbidden. This principle has already found wide use in responsibility-laden areas of life, including the management of human behaviour.
  • Capabilities and competences of the agent:
    Sensory and mechanical capabilities, reasoning abilities, and trained competency of the robot-agent.
  • Task-directed goals:
    The goals represent the will of the manufacturer, owner, or user of the robot-agent. Task-directed goals are defined such that upon reaching some pre-specified threshold of a measure, the task is considered more or less complete and the robot will not pursue this goal further. This is an important property of safe artificial agents. This strategy can be combined with the approach of diminishing returns, which additionally enables early stopping in case achieving more perfect results would turn out to be unfairly costly in other aspects. In the proposed framework the task-directed goals always have a lower priority than permissions and clearances. In other words, goals and actions are restricted by missing permissions and clearances.
  • Accountability mechanisms:
    - Whitelisting.
    - Layers of responsibilities and clearances / permissions.
    - The black box for logging purposes.
    - The legal user interface.
  • The black box:
    A secure system component for gathering logging data about who enabled particular clearances or permissions, who specified any particular task-directed goals, and which permissions or clearances enabled the subsequent particular decisions made by the robot.
  • The legal user interface:
    Contains information about clearances and permissions, goals, the legal consequences of the aforementioned, and any related training materials.
  • Learning:
    Adjustment of the internal parameters of the robot-agent during its operation, based on data gathered from communications, from the environment, or through self-reflection.
  • (Unlimited) optimisation goals:
    - Are a special kind of goal requiring separate and special treatment.
    - Goals which are defined such that there is no pre-specified (effective) threshold of a measure upon reaching which the goal is considered complete. Economic growth and company profits are examples of such measures. Unlimited optimisation goals give rise to the most urgent problems in AI safety. Similar problems have long been known in economics and sociology as Goodhart’s law and other related laws (for example, utilising key performance indicators in human resources management has similar issues).
    - Therefore the current project proposes focusing mostly on task-directed goals in such a manner that unlimited optimisation goals are treated only as an additional measure of the lowest priority among the goals of robot-agents (for example, a factory robot may be given an (additional) optimisation goal of cleaning up after itself after finishing its main work). Some of the unlimited optimisation goals can be re-defined as recurring task-directed goals. In any case, both task-directed goals and unlimited optimisation goals always have, in the proposed framework, a lower priority than permissions and clearances. In other words, actions and goals are restricted by missing permissions and clearances.
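
The interplay between task-directed goals, diminishing returns, and the priority of permissions can be illustrated with a minimal Python sketch. It is not part of the proposed standard: the names TaskGoal, goal_utility, and should_pursue, the square-root utility curve, and the fixed step of one unit are all assumptions chosen purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskGoal:
    """A task-directed goal: complete once the measure reaches the threshold."""
    name: str
    threshold: float
    required_permission: str  # hypothetical: permission that pursuing this goal needs


def goal_utility(goal: TaskGoal, measure: float) -> float:
    """Utility with diminishing returns, capped at the threshold.

    The square-root shape is an arbitrary illustrative choice.
    """
    capped = min(max(measure, 0.0), goal.threshold)
    return math.sqrt(capped / goal.threshold)


def should_pursue(goal: TaskGoal, measure: float,
                  whitelist: set, marginal_cost: float) -> bool:
    """Decide whether the agent keeps working on the goal."""
    # Permissions outrank goals: a missing permission blocks the goal entirely.
    if goal.required_permission not in whitelist:
        return False
    # Threshold reached: the task counts as complete and is not pursued further.
    if measure >= goal.threshold:
        return False
    # Early stopping: continue only while one more unit of progress
    # is worth more than what it costs in other aspects.
    gain = goal_utility(goal, measure + 1.0) - goal_utility(goal, measure)
    return gain > marginal_cost


if __name__ == "__main__":
    goal = TaskGoal("vacuum the floor", threshold=100.0,
                    required_permission="vacuum_floor")
    print(should_pursue(goal, 0.0, {"vacuum_floor"}, marginal_cost=0.01))
    print(should_pursue(goal, 0.0, set(), marginal_cost=0.01))
```

Under these assumptions an unlimited optimisation goal would correspond to a goal with no effective threshold, which is why the sketch has no such case: the only stopping conditions it knows are the missing permission, the threshold, and the cost comparison.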

The key topics that will be covered:

Accountability mechanisms:

  • Layers of responsibilities and permissions.
  • Whitelisting: enables limiting unforeseen or extra degrees of freedom in the robot-agent’s planning, learning, and behaviour.
  • A black box gathering logging data:
    - About who enabled a particular permission / clearance (without turning it on yet)
     — Data about who enabled the clearances or permissions in the upper level (for example, during manufacturing), so that the next layer of users had the option to turn on particular permissions.
    - About who turned on a particular permission / clearance (either for the agent or for the next layer of users).
    - About which permissions or clearances enabled particular decisions of the robot.
    - About who gave a particular goal to the agent.
  • The legal user interface.
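
The kinds of records listed above for the black box could be sketched roughly as follows. This is only an illustration under stated assumptions: the BlackBox class, its method names, and the event labels ("enabled", "turned_on", "goal_given", "decision") are hypothetical, and the SHA-256 hash chain is one possible way to make tampering detectable, not something the project text prescribes.

```python
import hashlib
import json
from dataclasses import dataclass, field

_FIELDS = ("event", "actor", "subject", "detail", "prev")


def _digest(body: dict) -> str:
    """Hash a log entry body in a stable (sorted-key) JSON encoding."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass
class BlackBox:
    """Append-only log; each entry stores the hash of the previous one."""
    entries: list = field(default_factory=list)

    def record(self, event: str, actor: str, subject: str, detail: str = "") -> dict:
        prev = self.entries[-1]["hash"] if self.entries else ""
        body = {"event": event, "actor": actor, "subject": subject,
                "detail": detail, "prev": prev}
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def accountable_for(self, permission: str) -> list:
        """Who enabled or turned on the permission behind a decision."""
        return [(e["event"], e["actor"]) for e in self.entries
                if e["subject"] == permission
                and e["event"] in ("enabled", "turned_on")]

    def verify(self) -> bool:
        """Check that no entry was altered or removed from the chain."""
        prev = ""
        for e in self.entries:
            body = {k: e[k] for k in _FIELDS}
            if e["prev"] != prev or e["hash"] != _digest(body):
                return False
            prev = e["hash"]
        return True


if __name__ == "__main__":
    box = BlackBox()
    box.record("enabled", "manufacturer", "drive_on_public_roads")
    box.record("turned_on", "owner", "drive_on_public_roads")
    box.record("goal_given", "operator", "deliver_parcel")
    box.record("decision", "agent", "drive_on_public_roads", "turned left")
    print(box.accountable_for("drive_on_public_roads"))
    print(box.verify())
```

The point of the sketch is the query at the end: given a decision, the log answers who enabled it and who turned it on, without needing to reconstruct why the agent chose it.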

Elements of the legal user interface:

  • Clearances and permissions.
    - Clearances are given based on the capabilities and competences of the robot-agent.
    - Permissions are given based on the preferences of the user.
    - All clearances and permissions that can be given are available only when these clearances and permissions were enabled by the upper layer clearances or permissions. For example, the owner of the robot cannot enable or turn on permissions that the manufacturer did not enable.
  • Provides information about the legal consequences of clearances and goals.
  • Standards for the training texts and videos which would be available directly in the same UI, tied to particular choices available in this UI. This results in the human obtaining their competence right there, where the “action” takes place without a need to remember long lists of facts for future use (as people are poor at remembering lists).
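
The layering rule in the list above, where a layer can only pass on what the layer above it enabled, can be sketched in a few lines of Python. The Layer class and its grant and who_enabled methods are hypothetical names, and the sketch deliberately merges "enabling" and "turning on" into a single grant operation and ignores revocation.

```python
from typing import Optional


class Layer:
    """One layer of responsibility, e.g. manufacturer, owner, or user."""

    def __init__(self, holder: str, parent: Optional["Layer"] = None):
        self.holder = holder
        self.parent = parent
        self.granted = set()

    def grant(self, item: str) -> None:
        """Grant a clearance or permission, if the upper layer enabled it."""
        if self.parent is not None and item not in self.parent.granted:
            raise PermissionError(
                f"{self.holder} cannot grant {item!r}: "
                f"not enabled by {self.parent.holder}"
            )
        self.granted.add(item)

    def who_enabled(self, item: str) -> list:
        """Holders who granted the item, from the top layer downwards."""
        chain = []
        layer = self
        while layer is not None:
            if item in layer.granted:
                chain.append(layer.holder)
            layer = layer.parent
        return list(reversed(chain))


if __name__ == "__main__":
    manufacturer = Layer("manufacturer")
    owner = Layer("owner", parent=manufacturer)
    user = Layer("user", parent=owner)

    manufacturer.grant("mow_lawn")
    owner.grant("mow_lawn")
    user.grant("mow_lawn")
    print(user.who_enabled("mow_lawn"))

    try:
        owner.grant("trim_hedge")  # the manufacturer never enabled this
    except PermissionError as err:
        print(err)
```

In a legal user interface built along these lines, an option the upper layer did not enable would simply not be offered, so the PermissionError here stands in for a choice that is never shown.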

Examples of similar principles already in use:

  • Competence-based permissions of public sector officials. In contrast to the private sector, where everything that is not explicitly forbidden is permitted, in the public sector everything that is not explicitly permitted is forbidden. Due to high demands on responsibility, the permissions are given based on specific certifications of competences, and the certifiers in turn have their associated responsibility.
  • Similarly, in the proposed framework, the robot-agents are interpreted as being representatives of people or legal persons. In other words, we would have “Robot-agents representing the people, by the people, for the people”.
  • Smartphone apps ask for permissions during installation, or during use in case they need additional permissions. This is built into the smartphone operating systems, so the apps cannot circumvent this standard. Note that there are no questions about blacklisting any app functionalities and access, only questions about whitelisting them.
  • Analogously, Facebook apps ask for permissions regarding data access and posting during registration.
  • Desktop operating systems likewise manage access to files on the hard drive by utilising whitelist-based permissions, not blacklists.

Potential corner cases and misuse cases to pay attention to:

  • The permissions must be sufficiently generic, not too detailed. For example, enabling many minute permissions could eventually have an emergent effect of indirectly enabling some composite actions that may have unintended negative consequences.
    - As a particular example, let’s assume that the robot-agent does not possess the capability to know the meanings of words. Still, the robot is given permission to use particular letters in its output without regard to the general verbal incompetency of the robot. The result is that the robot may combine the letters into dangerous words, since it is incompetent in foreseeing the results of these words.
  • When the more generic permission cannot be lawfully enabled because of the robot’s incompetency, then no permissions should be given in the many related minute aspects either, as that would be negligence leading to a possibly dangerous situation.
  • Mostly such corner cases are the responsibility of the manufacturer during product development and testing. Such issues, arising from potential interactions between many minute but complementarily interrelated permissions, should not become a robot-agent’s owner’s or user’s daily concern.
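
A manufacturer-side check for the corner case above could look roughly like the following Python sketch. The PRESUPPOSES mapping, the permission names, and the function name are all hypothetical; the only idea taken from the text is that a minute permission should not be grantable while the generic competence it presupposes is missing.

```python
# Hypothetical mapping: minute permission -> generic clearance it presupposes.
PRESUPPOSES = {
    "emit_letter": "verbal_competence",
    "emit_word": "verbal_competence",
}


def unsupported_permissions(whitelist: set, presupposes: dict) -> list:
    """Return minute permissions granted without their generic clearance.

    A non-empty result signals a whitelist that should be rejected
    during product development and testing.
    """
    return sorted(
        p for p in whitelist
        if p in presupposes and presupposes[p] not in whitelist
    )


if __name__ == "__main__":
    risky = {"emit_letter", "move_arm"}
    print(unsupported_permissions(risky, PRESUPPOSES))

    sound = {"emit_letter", "verbal_competence", "move_arm"}
    print(unsupported_permissions(sound, PRESUPPOSES))
```

Such a check only covers dependencies someone has written down in the mapping; emergent effects between permissions that nobody anticipated would still need testing.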


  • Preventing illegal weaponisation of robot-agents.
  • Avoiding legal loopholes in responsibilities.
  • Avoiding the suffering of humans and other living beings.
  • Avoiding excessive court case costs by the harmed parties.
  • Avoiding excessive profiting by the robot-agent owners at the expense of other people.
  • A good, technically competent and justified legal framework enables and improves innovation and an economically beneficial deployment of robot-agents, whereas a confused and socially stratifying legal framework with loopholes would lead to mistrust of technology.

Potential interested parties to cooperate with:

  • Policy / lawmakers.
  • Lawyers.
  • Society, via media publications and public discussions.
  • Insurance companies.
  • Various organisations issuing safety certificates.
  • Military (as many of the robot-agents could potentially be used as illegal weapons).
  • Future robotics agencies.
  • Manufacturers.
  • Consumer Protection Agency.

Related topics in AI and computer science:

Things that need to be explained to lawmakers / policy makers:

  • The robot must naturally operate only in an environment or in the situations corresponding to its sensory and mechanical capabilities, reasoning abilities, and trained competency. In other environments it must detect its incompetency automatically and leave, stop operations, or shut down, whenever possible. Additionally and no less importantly, the persons setting up or enabling such a scenario should be held responsible for negligence.
  • Insurance may help in covering the damages, but in any case the responsibility should lie with the users, owners, or manufacturers of the robot-agents. This is important in order to incentivise these parties to avoid side effects and damage caused by the activities of a robot-agent operating in their interests. The robot-agents can be programmed in such a way that they make a minimal number of “autonomous” mistakes, whether these stem from planning or from learning. Both of these autonomous aspects of robot-agents can and should be limited without any unjustified loss of usefulness of the robot-agents.
  • Confusion about the ability of the robot-agent to explain its choices:
    There is a difference between an algorithm’s or robot-agent’s capability to explain its decisions, and strong autonomy / free will / responsibility. The former does not imply the latter. Actually, research indicates that explainability / transparency may even come at the cost of reduced reasoning power.
    - Capability for self-reflection and consciousness are entirely different concepts. The first one is a property that can be programmed. Moreover, it has been known and used both in computer science and in practical computing for a long time. (It was one of the first capabilities developed in early AI research, back in the golden 1960s. See Wikipedia’s entry on SHRDLU for a famous demonstration conversation with a self-reflecting AI. Regardless of its impressive capabilities, that AI was far from being strongly autonomous.) Therefore self-reflection should not entail any additional special legal privileges or moral dilemmas.
    - Humans aren’t very good at explaining their decision-making either.
  • Confusions with the aspect of ongoing learning of the robot-agent during its operation (outside of manufacturing, training, or upgrading):
    Autonomous learning can also be limited / controlled by tolerances specified in whitelisting, and therefore the capability to learn does not mean that accountability will break. We should not accept use cases where it is morally complicated to assign responsibility to anyone other than the robot itself.
    - There is a difference between learning and data gathering / data exchange. Gathered / exchanged data of course needs to be either validated, or should not have safety related consequences, or if both of the aforementioned assumptions fail, there should be someone held responsible for transmitting the invalid and dangerous data over a critical and trusted channel.
    - A claim that autonomous learning makes robot-agents unpredictable is in fact exactly comparable to a claim that the ability to reason and plan makes robot-agents unpredictable. It is already agreed that robots should not be able to do too much thinking of their own. Therefore it is natural and obvious that not only autonomous planning but also autonomous learning can and should be limited by various technological and legal means, for which the general strategy of whitelisting is a perfect match.
    - The insurance companies do not have to ponder how to price insurance for learning robot-agents, because:
     — Humans also learn, but insurances are still able to decide the pricing policies.
     — The learning of robot-agents should be constrained by pre-specified tolerances and therefore not have any significant effect on the safety of their behaviour. The learning done by robot-agents should only affect the optimality of their actions. Any safety-related modifications need to be thoroughly tested / verified by the manufacturer; there is no way around it. It would not be possible to guarantee safe behaviour just by self-reflection and static analysis after new data is learned in an unrestricted manner. Even code updates need to be tested and cannot be proven correct just by self-reflection or static analysis. This applies all the more to anything that is learned from the external environment, because learned information is fuzzy and does not yield to logical analysis the way code does.
  • Confusion about whether unusual situations mean that the robot can be naturally expected to behave in unexpected ways.
    There are strategies in AI by which the behaviour in unusual situations can be limited / controlled:
     — First, anomaly detection algorithms (an existing and practice oriented field in AI).
     — Secondly, impact minimisation.
     — Finally and most importantly, there is a central principle that follows from the current proposal: the robot-agent should not operate in environments where it is not permitted / cleared to operate in (potentially because it is not capable, competent, or trained to safely operate there). In the case where such an environment is still encountered, the robot-agent should leave, stop operations, or shut down entirely.
  • Confusion about the guilt capability.
    AI-based agents may very well be able to detect that their actions are dangerous or even damaging. By nevertheless continuing their activities in such cases they simply make a compromise by deciding that some other goals are of higher priority, just as humans do. So the inability to detect wrong or bad actions cannot be the basis for excluding robot-agents from criminal responsibility. This exclusion has to be made on some other grounds, and preferably on explicit grounds. Note that I am all in favour of keeping humans and legal persons responsible. But it has to be done correctly and without loopholes.
  • Comparison with animals.
    Whether this comparison is valid depends very much on the implementation. In the case of utilising the whitelisting principle, one may argue that the teaching process of agents is almost exactly the opposite of training animals. Animals have many dangerous instincts that may manifest differently in different situations, and therefore we have to train the animals for each context and for each manifestation of their instincts separately, resulting in a suppression of a particular instinct in a particular situation, and so on for each slightly different situation. This is essentially equivalent to the blacklisting principle. In comparison, in the case of whitelisting, no infinite lists of unsafe actions are needed. Instead, only safe actions together with safe contexts are specified.
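
Two of the points in the list above lend themselves to a short Python sketch: whitelisting safe actions together with safe contexts (with stopping as the reaction to an uncleared environment), and keeping learned parameters within pre-specified tolerances. Everything named here (Response, decide, bounded_update, the action and context labels) is hypothetical, and reducing "leave, stop operations, or shut down" to a single STOP response is a simplification.

```python
from enum import Enum


class Response(Enum):
    PROCEED = "proceed"  # action is whitelisted for this context
    REFUSE = "refuse"    # context is cleared, but not for this action
    STOP = "stop"        # context is not cleared at all


def decide(action: str, context: str, whitelist: set) -> Response:
    """Whitelist of (action, context) pairs; anything else is forbidden."""
    cleared_contexts = {ctx for _, ctx in whitelist}
    if context not in cleared_contexts:
        # Uncleared environment: leave, stop operations, or shut down.
        return Response.STOP
    if (action, context) not in whitelist:
        return Response.REFUSE
    return Response.PROCEED


def bounded_update(proposed: float, baseline: float, tolerance: float) -> float:
    """Clamp a learned parameter to the manufacturer-tested tolerance band."""
    low, high = baseline - tolerance, baseline + tolerance
    return min(max(proposed, low), high)


if __name__ == "__main__":
    whitelist = {("vacuum", "living_room"), ("mow", "garden")}
    print(decide("vacuum", "living_room", whitelist))
    print(decide("mow", "living_room", whitelist))
    print(decide("vacuum", "street", whitelist))

    # Learning proposes a speed of 1.8, but only 1.0 +/- 0.5 was verified.
    print(bounded_update(proposed=1.8, baseline=1.0, tolerance=0.5))
```

Note that the whitelist stays finite: nothing needs to be said about the street or about any other context that was never cleared.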

Questions for future research.

  • GDPR (privacy and data protection laws).
  • Bias and discrimination in AI.
  • The need for explainability versus the motive for intellectual property protection (agents explaining their reasoning processes may thus reveal strategies and algorithms that are considered intellectual property).
  • Live coordination between humans and robot-agents.


Determining exactly why a robot-agent made a particular decision is often difficult. But the whitelisting-based accountability algorithm can greatly help by readily answering the question of who enabled that decision in the first place. Whitelisting enables accountability and humanly manageable safety features in robot-agents with learning capability.

On top of that, it helps to avoid unsolvable philosophical discussions in courtrooms about whether a particular agent acted with “strong autonomy”, willfully, or had guilt capacity (both of the last two have been, in a sense, true for a long time already). Such discussions would sink any law trying to regulate artificial agents.


I would like to thank Kea Kruuse and Eero Ränik for various very helpful questions and comments.
