Thanks for the nice post.


You are correct. A3C is typically used only in simulation, because running many robots in parallel is infeasible in practice. It is also not sample-efficient enough for real-world environments, since each worker would need thousands of episodes to converge to a meaningful policy.
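
To give a rough sense of scale, here is a back-of-envelope sketch. The worker count, episodes per worker, and episode length are hypothetical numbers picked only for illustration, not measurements from any A3C experiment, and `real_world_cost` is just a helper name I made up.

```python
def real_world_cost(workers, episodes_per_worker, seconds_per_episode):
    """Estimate the physical cost of running A3C-style workers on real robots.

    All inputs are illustrative assumptions, not measured values.
    """
    total_episodes = workers * episodes_per_worker
    # Every episode occupies one physical robot for its full duration.
    robot_hours = total_episodes * seconds_per_episode / 3600
    # Workers run in parallel, so wall-clock time is one worker's share.
    wall_clock_hours = episodes_per_worker * seconds_per_episode / 3600
    return total_episodes, robot_hours, wall_clock_hours


# Hypothetical setup: 8 robots, 5000 episodes each, 30 seconds per episode.
episodes, robot_hours, wall_hours = real_world_cost(
    workers=8, episodes_per_worker=5000, seconds_per_episode=30
)
print(f"{episodes} episodes, {robot_hours:.0f} robot-hours, "
      f"{wall_hours:.0f} hours of wall-clock time")
```

Under these assumed numbers, that is 8 physical robots each running for well over a day without resets, wear, or supervision, which is easy in a simulator and rarely practical on hardware.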
