Effect of batch size on training dynamics
This is a longer blog post in which I discuss the results of experiments I ran myself.
In this experiment, I investigate the effect of batch size on training dynamics. The metric I will focus on is the generalization gap, which is defined…