Understanding RMSprop: A Visual Guide

James Moon
Aug 24, 2023

RMSprop, which stands for Root Mean Square Propagation, is an optimization algorithm derived from Rprop (Resilient Propagation). It’s used to adjust and update the parameters of your model during training. Geoffrey Hinton first introduced RMSprop in one of his Coursera courses, and since then, it has become a popular choice for training deep neural networks.

How does RMSprop work?

RMSprop keeps a running (exponentially decaying) average of the squared gradients of each parameter and divides the step for that parameter by the square root of this average. As a result, the effective learning rate becomes smaller for parameters with consistently large gradients and larger for parameters with small gradients. This per-parameter adjustment can help with problems like slow convergence or divergence in deep networks.
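To make the update concrete, here is a minimal NumPy sketch of a single RMSprop step. The function name `rmsprop_step`, the argument names, and the default values (`lr=0.01`, `decay=0.9`, `eps=1e-8`) are assumptions chosen for illustration, not taken from any particular library.

```python
import numpy as np

def rmsprop_step(params, grads, avg_sq, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSprop update (illustrative sketch; names and defaults are assumed)."""
    # Running average of squared gradients, tracked separately per parameter.
    avg_sq = decay * avg_sq + (1.0 - decay) * grads ** 2
    # Scale each step by the root of that average; eps guards against division by zero.
    params = params - lr * grads / (np.sqrt(avg_sq) + eps)
    return params, avg_sq

# Two parameters whose gradients differ by four orders of magnitude.
params = np.array([1.0, 1.0])
grads = np.array([100.0, 0.01])
new_params, avg_sq = rmsprop_step(params, grads, np.zeros_like(params))
print("step taken per parameter:", params - new_params)
```

On this first step, both parameters move by roughly the same amount (about `lr / sqrt(1 - decay)`), even though their gradients are very different in size. That is the per-parameter scaling described above.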

Why use RMSprop?

  1. Adaptive Learning Rates: RMSprop dynamically adjusts the learning rate for each parameter based on the recent magnitudes of its gradients.
  2. Tames Vanishing/Exploding Gradients: Because each update is scaled by the recent gradient magnitude, the adaptive learning rates can keep updates from becoming too large when gradients explode (which could lead to divergence) or too small when gradients vanish (which could lead to slow convergence).
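The two points above can be seen on a toy problem. The sketch below compares plain gradient descent with RMSprop on a simple quadratic loss that is very steep in one direction and nearly flat in the other. The loss, the curvature values, the learning rate, and the helper names (`run_sgd`, `run_rmsprop`) are all assumptions made up for this illustration.

```python
import numpy as np

# Toy loss (assumed for illustration): f(w) = 0.5 * sum(curv * w**2),
# so the gradient is curv * w. One direction is steep, the other nearly flat.
curv = np.array([100.0, 0.01])

def grad(w):
    return curv * w

def run_sgd(w, lr, steps):
    # Plain gradient descent: the same learning rate for every parameter.
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def run_rmsprop(w, lr, steps, decay=0.9, eps=1e-8):
    # RMSprop: each parameter's step is divided by the root of its
    # running average of squared gradients.
    avg_sq = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w)
        avg_sq = decay * avg_sq + (1.0 - decay) * g ** 2
        w = w - lr * g / (np.sqrt(avg_sq) + eps)
    return w

w0 = np.array([1.0, 1.0])
w_sgd = run_sgd(w0, lr=0.03, steps=200)
w_rms = run_rmsprop(w0, lr=0.03, steps=200)
print("SGD:    ", w_sgd)
print("RMSprop:", w_rms)
```

With these settings, plain gradient descent blows up in the steep direction (each step multiplies that coordinate by 1 - 0.03 * 100 = -2) and barely moves in the flat one. RMSprop, using the same learning rate, takes steps of similar size in both directions and ends within about 0.1 of the minimum at zero for both parameters.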

James Moon

Baylor College of Medicine PhD Candidate | Quantitative & Computational Biology | Data scientist | Linkedin: shorturl.at/jBCDM