This is a really nice dive into the optimizer, thanks for taking the time to write it! You might also like to see how the optimizers behave on low-dimensional problems (like finding the roots of a polynomial).

https://www.youtube.com/watch?v=Z-CiRcrJiKo

Adam, GD, and RMSProp behave very differently even on this simple task!
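
If you want to poke at this yourself, here is a rough toy sketch (my own setup, not necessarily what the video does): find a root of p(x) = x^2 - 2 by minimizing p(x)^2 with the textbook update rules for each optimizer. The polynomial, starting point, learning rate, and step count are all assumptions picked for illustration.

```python
import math

def grad(x):
    # Loss is p(x)^2 with p(x) = x^2 - 2 (an assumed toy polynomial),
    # so the derivative is 2 * p(x) * p'(x).
    return 2.0 * (x * x - 2.0) * (2.0 * x)

def run_gd(x, lr=0.01, steps=2000):
    # Plain gradient descent: step size scales with the raw gradient.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def run_rmsprop(x, lr=0.01, decay=0.9, eps=1e-8, steps=2000):
    # RMSProp: divide the gradient by a running RMS of recent gradients.
    v = 0.0
    for _ in range(steps):
        g = grad(x)
        v = decay * v + (1 - decay) * g * g
        x -= lr * g / (math.sqrt(v) + eps)
    return x

def run_adam(x, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, steps=2000):
    # Adam: momentum on the gradient plus RMS scaling, both bias-corrected.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

x0 = 2.0  # assumed starting point
root = math.sqrt(2.0)
for name, fn in [("GD", run_gd), ("RMSProp", run_rmsprop), ("Adam", run_adam)]:
    x = fn(x0)
    print(f"{name:8s} x = {x:.6f}  |x - sqrt(2)| = {abs(x - root):.2e}")
```

Printing the final error is only the start; logging x at every step and plotting the three trajectories makes the difference in how they approach the root much easier to see.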