Jul 21, 2017 · 1 min read
When I run the Q-Table version, I get “Score over time: ” between roughly 0.3 and 0.6. When I run the TensorFlow version, I get “Percent successful episodes: ” between 0.2 and 0.5%. How do these results map to “solving” FrozenLake? That is very unclear to me from the averages alone. What does look clear to me is that, in the tf version, rList stops returning a 0 around the 750th episode.
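To make the question concrete, here is a minimal sketch of how I would check this: instead of averaging rList over all episodes, look at the success rate over a sliding window of recent episodes. It assumes rList holds one total reward (0 or 1) per episode. The helper names, the 100-episode window, and the 0.78 threshold are my own assumptions (0.78 is, as far as I know, the reward threshold Gym registers for FrozenLake-v0), not something taken from the tutorial code.

```python
def rolling_success_rate(r_list, window=100):
    """Mean reward over every run of `window` consecutive episodes."""
    if window <= 0:
        raise ValueError("window must be positive")
    rates = []
    running = 0.0
    for i, reward in enumerate(r_list):
        running += reward
        if i >= window:
            # Drop the episode that just slid out of the window.
            running -= r_list[i - window]
        if i >= window - 1:
            rates.append(running / window)
    return rates


def first_solved_episode(r_list, window=100, threshold=0.78):
    """0-based index of the episode that ends the first window whose
    mean reward reaches `threshold`, or None if that never happens.
    The default threshold is an assumption, not part of the tutorial."""
    for start, rate in enumerate(rolling_success_rate(r_list, window)):
        if rate >= threshold:
            return start + window - 1
    return None


# Made-up data that mimics what I described: failures up to episode 750,
# successes afterwards. These are not real training results.
fake_r_list = [0] * 750 + [1] * 250
overall_mean = sum(fake_r_list) / len(fake_r_list)
solved_at = first_solved_episode(fake_r_list)
```

With data shaped like this, the overall mean stays low even though the agent succeeds on every late episode, which may be why the printed averages look unimpressive while rList itself looks convincing.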
