Setting up Our Model with Look-Ahead
Nov 4 · 6 min read

Last week we went over some of the basics of Temporal Difference (TD) learning. We explored a bit of its history and compared it to its cousin, Q-Learning. Now let’s start writing some code. Since TD learning has a lot in common with Q-Learning, our code will follow a similar structure.
This is at least the third different model we’ve defined over the course of this series. So we can now start observing the…

