Deep Learning Best Practices (1) — Weight Initialization

Neerja Doshi

Published in USF-Data Science

7 min read · Mar 26, 2018

Basics, weight initialization pitfalls & best practices

Motivation

As a beginner in deep learning, one of the things I realized is that there isn’t much online documentation that covers all the deep learning tricks in one place. There are lots of small best practices, ranging from simple tricks like weight initialization and regularization to slightly more complex techniques like cyclic learning rates, that can make training and debugging neural nets easier and more efficient. This inspired me to write this series of blogs, where I will cover as many nuances as I can to make implementing deep learning simpler for you.

This blog assumes that you have a basic idea of how neural networks are trained. An understanding of weights, biases, hidden layers, activations and activation functions will make the content clearer. I would recommend this course if you wish to build a basic foundation in deep learning.

Note: Whenever I refer to layers of a neural network, I mean the layers of a simple neural network, i.e. the fully connected layers. Of course, some of the methods I talk about apply to convolutional and recurrent neural networks as well. In this blog I am going to talk about the issues related to initialization of weight matrices and ways to mitigate…

Written by Neerja Doshi