Normalization Techniques in Deep Neural Networks

We are going to study Batch Norm, Weight Norm, Layer Norm, Instance Norm, Group Norm, Batch-Instance Norm, and Switchable Norm.

Why do we need Normalization?

Batch Normalization

This image is taken from https://arxiv.org/pdf/1502.03167.pdf
ϵ is a small stability constant added to the variance in the equation to avoid division by zero.
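
As a concrete illustration, here is a minimal NumPy sketch of the training-time batch norm computation for an (N, C, H, W) input. The function name and shapes are illustrative, and the running statistics used at inference time are omitted:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # x: (N, C, H, W) mini-batch; statistics are computed per channel
    # over the batch and spatial dimensions (N, H, W).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)   # eps is the stability constant
    return gamma * x_hat + beta               # learnable scale and shift

x = np.random.randn(8, 16, 32, 32)
gamma = np.ones((1, 16, 1, 1))
beta = np.zeros((1, 16, 1, 1))
y = batch_norm(x, gamma, beta)
```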

Alternatives to batch norm (or improved norms) are discussed in detail below, but if you only want a very short description (or a quick revision at a glance), look at this:

This image is taken from https://arxiv.org/pdf/1803.08494.pdf

Weight Normalization

Note: by the law of large numbers, the mean is less noisy than the variance, which makes the mean a safer statistic to rely on than the variance here.
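
To make the reparameterization concrete, here is a minimal NumPy sketch of weight normalization, w = g · v / ‖v‖, applied to a single weight vector. The variable names are illustrative; in a real layer the norm is taken per output unit:

```python
import numpy as np

def weight_norm(v, g):
    # Reparameterize a weight vector as w = g * v / ||v||,
    # decoupling its direction (v / ||v||) from its magnitude (g).
    return g * v / np.linalg.norm(v)

v = np.random.randn(256)   # unconstrained direction parameter
g = 2.0                    # learnable scalar magnitude
w = weight_norm(v, g)
print(np.linalg.norm(w))   # equals |g|
```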

Layer Normalization

Here, i indexes the batch and j indexes the features; xᵢ,ⱼ is the (i, j)-th element of the input data.
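
A minimal NumPy sketch of layer normalization over a (batch, features) input, following the indexing above (names and shapes are illustrative). Note that the statistics are computed per example across the features, so the result does not depend on the batch size:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # x: (batch, features); statistics are computed per example
    # across the feature dimension, independent of the batch size.
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(4, 128)
y = layer_norm(x, gamma=np.ones(128), beta=np.zeros(128))
```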

Instance (or Contrast) Normalization

This image is taken from https://arxiv.org/pdf/1607.08022.pdf
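
A minimal NumPy sketch, assuming an (N, C, H, W) input: the statistics are computed per example and per channel, over the spatial dimensions only, so the normalization of one image never depends on the rest of the batch:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # x: (N, C, H, W); statistics are computed per example and per
    # channel, over the spatial dimensions only (no batch dependence).
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

y = instance_norm(np.random.randn(8, 16, 32, 32))
```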

Group Normalization

Sᵢ is the set of indices over which the mean and standard deviation are computed; it is defined below.
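
A minimal NumPy sketch of group normalization for an (N, C, H, W) input; the group of channels reduced over within each example plays the role of Sᵢ. The number of groups and the shapes are illustrative:

```python
import numpy as np

def group_norm(x, num_groups, gamma, beta, eps=1e-5):
    # x: (N, C, H, W); channels are split into groups and statistics
    # are computed per example over each group's channels and the
    # spatial dimensions -- this reduction set is S_i.
    n, c, h, w = x.shape
    x = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = x.mean(axis=(2, 3, 4), keepdims=True)
    var = x.var(axis=(2, 3, 4), keepdims=True)
    x = (x - mean) / np.sqrt(var + eps)
    x = x.reshape(n, c, h, w)
    return gamma * x + beta

x = np.random.randn(2, 16, 32, 32)
y = group_norm(x, num_groups=4,
               gamma=np.ones((1, 16, 1, 1)), beta=np.zeros((1, 16, 1, 1)))
```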

Batch-Instance Normalization

The value of ρ lies between 0 and 1; it acts as a learnable gate that balances the batch-normalized and instance-normalized outputs.
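
A simplified NumPy sketch of the idea: the batch-normalized and instance-normalized activations are blended with the gate ρ. In the paper ρ is a learnable per-channel parameter constrained to [0, 1]; here it is passed in as a fixed value purely for illustration:

```python
import numpy as np

def batch_instance_norm(x, rho, gamma, beta, eps=1e-5):
    # Blend batch-normalized and instance-normalized versions of the
    # input with a gate rho in [0, 1]:
    #   y = gamma * (rho * x_bn + (1 - rho) * x_in) + beta
    bn_mean = x.mean(axis=(0, 2, 3), keepdims=True)
    bn_var = x.var(axis=(0, 2, 3), keepdims=True)
    x_bn = (x - bn_mean) / np.sqrt(bn_var + eps)

    in_mean = x.mean(axis=(2, 3), keepdims=True)
    in_var = x.var(axis=(2, 3), keepdims=True)
    x_in = (x - in_mean) / np.sqrt(in_var + eps)

    return gamma * (rho * x_bn + (1 - rho) * x_in) + beta

x = np.random.randn(8, 16, 32, 32)
rho = np.full((1, 16, 1, 1), 0.5)   # learned per channel in practice
y = batch_instance_norm(x, rho,
                        np.ones((1, 16, 1, 1)), np.zeros((1, 16, 1, 1)))
```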

From the discussion above, a natural question arises: if two normalizers can be blended like this, why not let the network learn to choose among all of them?

Switchable Normalization
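
A simplified NumPy sketch of switchable normalization: the mean and variance are computed in the instance-norm, layer-norm and batch-norm ways, then mixed with softmax weights. In practice those weights are learned end to end; here the logits are passed in and the shapes are illustrative:

```python
import numpy as np

def softmax(z):
    z = np.exp(z - z.max())
    return z / z.sum()

def switchable_norm(x, mean_logits, var_logits, gamma, beta, eps=1e-5):
    # Compute IN, LN and BN statistics for an (N, C, H, W) input and
    # mix them with softmax weights (learned by the network in practice).
    means = {
        'in': x.mean(axis=(2, 3), keepdims=True),
        'ln': x.mean(axis=(1, 2, 3), keepdims=True),
        'bn': x.mean(axis=(0, 2, 3), keepdims=True),
    }
    variances = {
        'in': x.var(axis=(2, 3), keepdims=True),
        'ln': x.var(axis=(1, 2, 3), keepdims=True),
        'bn': x.var(axis=(0, 2, 3), keepdims=True),
    }
    wm = softmax(mean_logits)   # weights over (in, ln, bn) for the mean
    wv = softmax(var_logits)    # weights over (in, ln, bn) for the variance
    mean = sum(w * means[k] for w, k in zip(wm, ('in', 'ln', 'bn')))
    var = sum(w * variances[k] for w, k in zip(wv, ('in', 'ln', 'bn')))
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.randn(8, 16, 32, 32)
y = switchable_norm(x, mean_logits=np.zeros(3), var_logits=np.zeros(3),
                    gamma=np.ones((1, 16, 1, 1)), beta=np.zeros((1, 16, 1, 1)))
```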

