Aug 24, 2017 · 1 min read
What about when you see the norm of the gradients increase while the loss seems to have converged? I would have expected that to be a bug, but to my surprise, it's an expected behavior of NNs and CNNs?