PyIDF: Diversity of experiences in Reinforcement Learning

Kostya Kanishev
Sep 13, 2019 · 11 min read

In Reinforcement Learning, an agent learns from interaction with its environment. The agent’s goal is to arrive at a behavior policy that maximizes the expected reward. Intuitively, an agent should learn from a diverse set of experiences, but how one can quantify and ensure this “diversity”? At Imandra, we’ve developed a novel technique for exploring algorithm state-spaces called Region