Christoph Dann in Stanford AI for Human Impact
Policy Certificates and Minimax-Optimal PAC Bounds for Episodic Reinforcement Learning
Designing reinforcement learning methods which find a good policy with as few samples as possible is a key goal of both empirical and…
Aug 16, 2019