And we have no way at all to communicate which of the possible generalizations it should use.
Turning reflection up to 11
Paul Christiano

Shouldn’t we, in theory, be able to use reinforcement learning to communicate this? That is, after the supervised training, query Arthur on various pairs (Q, 9) and give it appropriate feedback until it starts to give the answers we want, then move on to (Q, 10), and so on. It seems to me that Arthur should catch on quickly, as long as its prior assigns non-negligible weight to the intended generalization.
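
To make the proposed procedure concrete, here is a very stripped-down toy sketch. Everything in it is a hypothetical stand-in rather than anything from the post: `ArthurModel` reduces "which generalization the model uses" to a single number, `reward_from_feedback` plays the role of our feedback, and the update rule is simple hill climbing where a real implementation would use policy-gradient updates on a generative model.

```python
import random

# Toy stand-in for the trained model; "bias" is a crude proxy for which
# generalization it has latched onto (bias == 0 is the intended one).
class ArthurModel:
    def __init__(self):
        self.bias = 0.0

    def answer(self, question, n):
        # A real model would generate text; here the "answer" is a number.
        return len(question) + n + self.bias

def reward_from_feedback(answer, question, n):
    # Stand-in for our feedback: higher reward the closer the answer is
    # to the one we intended for (Q, n).
    intended = len(question) + n
    return -abs(answer - intended)

def rl_finetune(model, questions, levels, steps_per_level=200, eps=0.05):
    # Work through reflection levels one at a time, as suggested above:
    # give feedback on (Q, 9) until the answers look right, then (Q, 10),
    # and so on. The "RL" here is the simplest possible stand-in: perturb
    # the model and keep the perturbation only if feedback improves.
    for n in levels:
        for _ in range(steps_per_level):
            q = random.choice(questions)
            delta = random.choice([-eps, eps])
            before = reward_from_feedback(model.answer(q, n), q, n)
            model.bias += delta
            after = reward_from_feedback(model.answer(q, n), q, n)
            if after < before:
                model.bias -= delta  # revert updates that made feedback worse

model = ArthurModel()
model.bias = 1.0  # start off using a "wrong" generalization
rl_finetune(model, ["Q1?", "Q2?", "a longer question"], levels=[9, 10, 11])
print(round(model.bias, 2))  # ends near 0, i.e. the intended generalization
```

In this toy setting the feedback on level 9 alone already pins down the intended behavior, which mirrors the intuition in the comment: if the prior puts non-negligible weight on the intended generalization, a modest amount of feedback should suffice.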