And this strategy doesn’t need to be implemented perfectly at all.
I agree that the meta problem is different from the object-level problem.
Paul Christiano

I’m trying to imagine what this initial strategy would actually look like, and failing. What I’d REALLY like to see (if you decide to make this question a priority) is a sketch of that: a human training the first A on a set of problems that would plausibly lead it to solve the recursive meta-programming problem at all levels in an aligned way, even if only approximately.