David Krueger · Oct 15, 2016

How can a reward function be aligned if it doesn’t recognize damaging behaviour?