Setting up Our Model with Look-Ahead

James Bowen
Nov 4 · 6 min read

Last week we went over some of the basics of Temporal Difference (TD) learning. We explored a bit of the history, and compared it to its cousin, Q-Learning. Now let’s start getting some code out there. Since there’s a lot in common with Q-Learning, we’ll want a similar structure.

This is at least the third different model we’ve defined over the course of this series. So we can now start observing the…


Written by James Bowen, author of Monday Morning Haskell (http://mmhaskell.com)
