Feb 23, 2017
In a comment, you mentioned that max_episode_length is a timeout for episodes that have gone on too long, but I don't think it is actually being used for that. Shouldn't this part of the code be:
if d == True or episode_step_count == max_episode_length - 1:
    break
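To show what I mean, here is a minimal, self-contained sketch of an episode loop with that check in place. It is not the code from your post: run_episode and step_fn are hypothetical names I made up for illustration, and step_fn just stands in for whatever environment step returns the done flag d.

```python
def run_episode(step_fn, max_episode_length):
    """Run one episode; stop on a terminal state or at the step cap.

    step_fn is a hypothetical stand-in for the environment step: it takes
    the current step index and returns True when the episode is done.
    """
    episode_step_count = 0
    while True:
        d = step_fn(episode_step_count)
        # Exit either because the environment finished or because
        # the episode has reached max_episode_length steps.
        if d or episode_step_count == max_episode_length - 1:
            break
        episode_step_count += 1
    return episode_step_count + 1  # number of steps actually taken


# An environment that never signals done is still cut off at the cap.
steps_never_done = run_episode(lambda t: False, max_episode_length=300)

# An environment that finishes on its 5th step ends early.
steps_early = run_episode(lambda t: t == 4, max_episode_length=300)
```

Without the second condition, the first call would loop forever, which is why I suspect the timeout is not taking effect as described.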