Layer 10: Transformer: Grid Generator and Sampler. This logic is implemented in
spatial_transformer.py. The layer outputs images with the same dimensions as its input (32x32x1), with the affine transformation applied (e.g., a zoomed-in or rotated image).
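To make the two stages concrete, here is a minimal NumPy sketch of a grid generator and a bilinear sampler. It is not the code in spatial_transformer.py: the function names `affine_grid` and `bilinear_sample`, the normalized [-1, 1] coordinate convention, and the border clamping are all assumptions chosen for illustration.

```python
import numpy as np


def affine_grid(theta, height, width):
    """Grid generator: map every output pixel to a source coordinate.

    theta is a 2x3 affine matrix acting on normalized coordinates in [-1, 1].
    Returns two (height, width) arrays with the source x and y coordinates.
    """
    xs = np.linspace(-1.0, 1.0, width)
    ys = np.linspace(-1.0, 1.0, height)
    x_t, y_t = np.meshgrid(xs, ys)  # each has shape (height, width)
    # Homogeneous target coordinates, shape (3, height * width).
    target = np.stack([x_t.ravel(), y_t.ravel(), np.ones(height * width)])
    source = theta @ target  # shape (2, height * width)
    return source[0].reshape(height, width), source[1].reshape(height, width)


def bilinear_sample(image, x_s, y_s):
    """Sampler: read the image at fractional source coordinates.

    image has shape (height, width, channels). Coordinates that fall outside
    the image are clamped to the nearest border pixel (an assumption here).
    """
    height, width = image.shape[:2]
    # Convert normalized coordinates to pixel coordinates.
    x = (x_s + 1.0) * (width - 1) / 2.0
    y = (y_s + 1.0) * (height - 1) / 2.0
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    # Interpolation weights, with a trailing axis to broadcast over channels.
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]
    # Clamp the four neighbour indices so lookups stay inside the image.
    x0c = np.clip(x0, 0, width - 1)
    x1c = np.clip(x0 + 1, 0, width - 1)
    y0c = np.clip(y0, 0, height - 1)
    y1c = np.clip(y0 + 1, 0, height - 1)
    top = (1.0 - wx) * image[y0c, x0c] + wx * image[y0c, x1c]
    bottom = (1.0 - wx) * image[y1c, x0c] + wx * image[y1c, x1c]
    return (1.0 - wy) * top + wy * bottom


# Usage on a random 32x32x1 image.
image = np.random.default_rng(0).random((32, 32, 1))

identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
unchanged = bilinear_sample(image, *affine_grid(identity, 32, 32))

# Scaling the grid by 0.5 samples only the central half: a 2x zoom-in.
zoom = np.array([[0.5, 0.0, 0.0],
                 [0.0, 0.5, 0.0]])
zoomed = bilinear_sample(image, *affine_grid(zoom, 32, 32))
```

Both outputs keep the 32x32x1 shape of the input, because the grid always has one entry per output pixel regardless of theta. Every step is a weighted sum of pixel values, which is what makes the sampler differentiable with respect to the grid coordinates.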
The STN (spatial transformer network) is a differentiable module that can be inserted into a convolutional neural network. The default choice is to place it right after the input layer, so that it learns the transformation matrix theta that minimizes the loss function of the main classifier (in our case, IDSIA).
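For reference, the role of theta can be written out as follows. This uses the usual formulation of an affine spatial transformer; the symbols $(x^t, y^t)$ for a pixel position in the output and $(x^s, y^s)$ for the position sampled from the input are notation assumed here, not taken from the code.

```latex
\begin{pmatrix} x^{s} \\ y^{s} \end{pmatrix}
=
\theta
\begin{pmatrix} x^{t} \\ y^{t} \\ 1 \end{pmatrix}
=
\begin{pmatrix}
\theta_{11} & \theta_{12} & \theta_{13} \\
\theta_{21} & \theta_{22} & \theta_{23}
\end{pmatrix}
\begin{pmatrix} x^{t} \\ y^{t} \\ 1 \end{pmatrix}
```

With $\theta_{11} = \theta_{22} = 1$ and all other entries zero, the mapping is the identity and the image passes through unchanged. Because the sampled output is differentiable with respect to theta, the classifier's loss gradient can flow back through the sampler and the grid generator to whatever produces theta.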