I basically agree. The questions are whether problems without a precise specification (potentially including the pursuit of human values) are fundamentally harder, and whether imitation/approval-direction is a significantly weaker technique in principle. Neither answer seems at all clear to me.
(I don’t really find the examples in this paragraph compelling as evidence about these questions.)
Note that explicit reasoning and domain-specific algorithms can play a role in either goal-directed or act-based systems. The game I’m playing is to understand what constraints a given technique places on the goals that can be effectively pursued, and then to think about how to build a system that contributes effectively to human values despite those constraints. (MIRI is playing a different game.)
For now, systems based on explicit reasoning or domain-specific algorithms don’t seem to be at risk of effectively pursuing inhuman values. That makes it hard to evaluate the usefulness of work like this, which was aimed at overcoming constraints that those kinds of systems might impose. Over time that is likely to change.