Everything You Need to Know about Gradient Descent Applied to Neural Networks

Explanation // Versions // Algorithm steps // Optimization techniques

Jaime Durán
yottabytes

--

Also available in Spanish

Introduction

A few weeks ago I was asked to give a short talk about Machine Learning to my teammates, starting from scratch. The main difficulty was condensing many different topics (while cutting out many others) into barely an hour and a quarter, with no casualties! In my opinion, setting that goal at the outset was a big mistake. I don’t know at what point I decided to go a little deeper into Deep Learning, trying for instance to explain in one minute how a neural network is trained; something that drained my colleagues’ will to live. So that day I assigned myself the task of writing an article about what I had explained so badly, this time with no time limit…

First, I’ll explain in detail how Gradient Descent works, clarifying everything that was at some point blurry to me. Then I’ll cover some aspects to consider when training a deep neural network. Finally, I’ll go over the existing techniques for optimizing the algorithm. Here we go!

--
