Deep Reinforcement Learning Demystified (Episode 2) — Policy Iteration, Value Iteration and Q…
Moustafa Alzantot

Nice article, but I am getting an error in both examples.

'TimeLimit' object has no attribute 'P' (raised at env.P[s][a])

'TimeLimit' object has no attribute 'nS'

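I guess you can no longer access the transition model and the state/action counts directly on the environment returned by gym.make, since it is now wrapped in a TimeLimit object.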

Is there any solution to the first error?
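One possible workaround, sketched under the assumption that the environment is a tabular one such as FrozenLake created with gym.make: the TimeLimit wrapper does not forward P or nS, but the underlying environment can usually be reached through env.unwrapped, and the state and action counts can be read from the spaces. The helper name tabular_model is hypothetical and only for illustration.

```python
def tabular_model(env):
    """Return (P, n_states, n_actions) for a wrapped tabular Gym environment.

    Assumption: the innermost environment exposes a transition table `P`
    (as FrozenLake does) and both spaces are Discrete.
    """
    base = env.unwrapped                 # strip TimeLimit and any other wrappers
    P = base.P                           # P[s][a] -> list of (prob, next_state, reward, done)
    n_states = env.observation_space.n   # replaces env.nS
    n_actions = env.action_space.n       # replaces env.nA
    return P, n_states, n_actions


# Usage sketch (assumes gym is installed and the environment id exists
# in your version, e.g. "FrozenLake-v0"):
#   import gym
#   env = gym.make("FrozenLake-v0")
#   P, nS, nA = tabular_model(env)
#   for prob, next_state, reward, done in P[s][a]:
#       ...
```

In the article's code this would mean replacing env.P[s][a] with the returned P[s][a], and env.nS / env.nA with the returned counts. Alternatively, reassigning env = env.unwrapped right after gym.make may be enough for the first error, at the cost of losing the episode time limit.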

