Trust Region Policy Optimization — Overview
Sep 3, 2018 · 1 min read
An overview of the Trust Region Policy Optimization (TRPO) paper and the reinforcement learning algorithm it introduces.
I skip the hard math that later algorithms inspired by TRPO have not needed. I also skip the vine method, which has seen less interest in the literature than the single-path method.