Huang henry
1 min read · Nov 8, 2018


Assuming the overestimation exists, I know that if all values are overestimated uniformly, the optimal policy (the one that is greedy with respect to those values) remains the same as it would be with no overestimation. But why is the overestimation not uniform?
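To make the premise concrete, here is a small sketch of what I mean. The action values and bias numbers are made up purely for illustration, and `greedy_action` is just a hypothetical helper name. It only shows that a uniform offset leaves the greedy action unchanged while a non-uniform one can change it; it does not explain where a non-uniform bias would come from.

```python
import numpy as np

def greedy_action(q_values):
    """Return the index of the action with the largest estimated value."""
    return int(np.argmax(q_values))

# Hypothetical true action values for a single state (illustrative only).
q_true = np.array([1.0, 2.5, 0.3])

# Uniform overestimation: every action value is inflated by the same amount.
q_uniform = q_true + 0.7

# Non-uniform overestimation: each action is inflated by a different amount
# (assumed numbers, chosen so that the ranking of actions changes).
q_nonuniform = q_true + np.array([2.0, 0.0, 0.1])

print(greedy_action(q_true))        # greedy action under the true values
print(greedy_action(q_uniform))     # same greedy action as above
print(greedy_action(q_nonuniform))  # a different greedy action
```

So the uniform case is harmless for the greedy policy, which is why the question of uniformity matters to me.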
