Summary: GameGAN

Ideas from this summary are taken from the GameGAN paper.

May 6, 2019

Summary: Conservative Policy Iteration

Conservative Policy Iteration has three goals: (1) provide an iterative procedure guaranteed to improve a performance metric, (2) terminate in a “small” number of steps, and (3) find an “approximately” optimal policy. These goals are met by relying on a few assumptions…

Zac Wellmer

Apr 24, 2019

Summary: SimPLe

Ideas and figures from this summary are taken from Model-Based Reinforcement Learning for Atari (SimPLe).

Zac Wellmer

Feb 24, 2019

Summary: PlaNet

Deep Planning Network (PlaNet) is a model-based agent that learns a latent state dynamics model from images and takes actions…

Zac Wellmer

Feb 16, 2019

Summary: World Models

One of the core issues in reinforcement learning is sample complexity. Therefore, it’s appealing to train RL agents in a…

Zac Wellmer

Jan 31, 2019

Summary: Learning Plannable Representations with Causal InfoGAN

Zac Wellmer

Sep 15, 2018

Summary: Value Prediction Networks (VPN)

VPN is a deep reinforcement learning architecture that mixes ideas from both model-free and model-based methods. Generally, model-based methods learn environment dynamics so as to predict real observations; however, VPN attempts to learn a dynamics model that…