You are right that A1 ought to be able to win using only a powerful computer, since that computer suffices to do the recursive bootstrapping.
But this reduction will produce wild-looking strategies, so if you make the reduction you can't then dismiss strategies merely because they look wild.
The idea is that if you just had a very powerful computer, you could become smarter and wiser until you were smarter than the superintelligence you are overseeing.
For example, in the setting you described, we could engage in a very extensive process of reflection. Are you expressing skepticism about whether even extensive reflection can take us to arbitrarily high capability, or about whether the bootstrapping in this post can compete with that kind of extreme reflection? (I would have assumed the latter, but in your comment you reduce to the case of a human with an arbitrarily powerful computing environment.)
After becoming smarter, we hope for the best. Ideally you would have some way to inspect the thought process of the superintelligence you are overseeing, in which case I think it is very likely you can defend yourself from attacks. If you can't solve the informed oversight problem, then I agree it is less clear that you are safe, even once you are smarter than the superintelligence you are overseeing.