Learning Parameters

Let’s look at gradient descent with an adaptive learning rate.

Motivation for Adaptive Learning Rate

Before moving on to advanced optimization algorithms, let us revisit the problem of choosing the learning rate in gradient descent.

Let’s digress a bit from optimizers and talk about the stochastic versions of these algorithms.


Let’s look at two simple, yet very useful variants of gradient descent.

Gradient Descent is an iterative optimization algorithm for finding the (local) minimum of a function.
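
To make the idea concrete, here is a minimal sketch of the update loop in Python. The function names (`gradient_descent`, `grad_f`), the default learning rate and step count, and the example function f(x) = (x - 3)^2 are all assumptions chosen for illustration; they are not taken from the original posts.

```python
def gradient_descent(grad, x0, learning_rate=0.1, num_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(num_steps):
        # Move a small step in the direction that decreases the function.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical example: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
def grad_f(x):
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)  # approaches 3, the minimum of f
```

With this learning rate, each step shrinks the distance to the minimum by a constant factor. A learning rate that is too large would overshoot instead, which is one reason the choice of learning rate matters.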

Here is a quick look at the basics essential to understanding how parameters are learned.

  1. Multivariable Functions
  2. Local Minimum vs. Global Minimum
  3. Understanding The Gradient
  4. Cost or Loss Function
  5. Contour Maps

1. Multivariable Functions

Working Example



Biological Neurons: An Overly Simplified Illustration

A Biological Neuron (source: Wikipedia)

Akshay L Chandra

Deep Learning Research Assistant @ IIT Hyderabad.

