Sep 5, 2018 · 1 min read
Hi, thank you for reading my article!
DQN's loss function has exactly the same form as the update in classical Q-learning: both are built on the TD error.

Both DQN and classical Q-learning build their target from the immediate reward and the predicted value at the next state, so they are basically doing the same thing.
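
To make the comparison concrete, here is a minimal sketch in plain Python. The function names, the squared-error form of the loss, and the default values for `alpha` and `gamma` are my own assumptions for illustration; a real DQN would compute the next-state values with a neural network (usually a separate target network) rather than take them as a list.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Immediate reward plus the discounted best predicted value at the next state."""
    if done:
        # No next state to bootstrap from at the end of an episode.
        return reward
    return reward + gamma * max(next_q_values)


def q_learning_update(q_sa, reward, next_q_values, alpha=0.1, gamma=0.99, done=False):
    """Classical Q-learning: move Q(s, a) toward the target by alpha * TD error."""
    td_error = td_target(reward, next_q_values, gamma, done) - q_sa
    return q_sa + alpha * td_error


def dqn_loss(q_sa, reward, next_q_values, gamma=0.99, done=False):
    """DQN-style loss (assumed here to be half the squared TD error)."""
    td_error = td_target(reward, next_q_values, gamma, done) - q_sa
    return 0.5 * td_error ** 2
```

Both functions compute the same TD error from the same target. The tabular version applies it directly to a table entry, while the DQN version turns it into a loss that is minimized by gradient descent on the network weights.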
