
…sarial loss functions have been explored, and some seem more stable in practice than others. If you'd like to read more, I'd recommend "Are GANs Created Equal? A Large-Scale Study" by Lucic et al., an excellent recent review of the GAN landscape.
Adadelta is an extension of AdaGrad that seeks to address AdaGrad's decaying learning rate problem. Instead of accumulating all past squared gradients, Adadelta restricts the window of accumulated past gradients to some fixed size w.
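To make this concrete, here is a minimal NumPy sketch of a single Adadelta step. The names (`adadelta_step`, `rho`, `eps`) are my own, not from any particular library; note that in practice the fixed window of size w is realised as an exponentially decaying average of squared gradients, so no w past gradients are actually stored.

```python
import numpy as np

def adadelta_step(params, grad, acc_grad, acc_update, rho=0.95, eps=1e-6):
    """One Adadelta update (illustrative sketch, not a library API)."""
    # Decaying average of squared gradients -- the "window of size w"
    # is implemented as an exponential moving average, so only these
    # two running accumulators ever need to be kept.
    acc_grad = rho * acc_grad + (1.0 - rho) * grad ** 2

    # Step scaled by the ratio of RMS(past updates) to RMS(gradients);
    # notice there is no global learning rate to hand-tune.
    delta = -np.sqrt(acc_update + eps) / np.sqrt(acc_grad + eps) * grad

    # Decaying average of squared updates, used to scale the next step.
    acc_update = rho * acc_update + (1.0 - rho) * delta ** 2

    return params + delta, acc_grad, acc_update

# Usage inside a training loop: initialise both accumulators to zero,
# then thread them through each step.
params = np.zeros(3)
acc_grad = np.zeros_like(params)
acc_update = np.zeros_like(params)
grad = np.array([0.1, -0.2, 0.3])  # stand-in gradient for illustration
params, acc_grad, acc_update = adadelta_step(params, grad, acc_grad, acc_update)
```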