Trust Region Policy Optimization — Overview

Bruce Krayenhoff
Sep 3, 2018

An overview of the Trust Region Policy Optimization (TRPO) paper and the reinforcement learning algorithm it introduces.

I skip the hard math that subsequent algorithms inspired by TRPO have not needed. I also skip the vine sampling method, which has seen comparatively little interest in the literature.