Agents teaching each other to manipulate is a serious concern for such a sequence of supervisors.
Paul Christiano

I think the feasibility of avoiding goal-directed agents entirely is questionable. An approval-directed agent has to be powerful enough to make meaningful use of a model of its overseer, which is very challenging if the overseer is a human. I'm not sure constructing such an agent "manually" (without using self-improvement) is feasible within the relevant timeframe. In other words, if it is much easier to reach superintelligence via goal-directed agents, then a friendly project that avoids goal-directed agents is unlikely to outrace competing unfriendly projects.
