In most cases, the non-linearity should be a rectified linear unit (ReLU) or one of its variants. Batch normalisation can be applied after the non-linearity; it mainly stabilises training and has a mild regularising effect that can help reduce overfitting.
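
As a rough illustration of the ordering described above (non-linearity first, then batch normalisation), here is a minimal pure-Python sketch. The function names (`relu`, `leaky_relu`, `batch_norm`, `block`), the leaky slope of 0.01, the fixed `gamma`/`beta` values, and the toy input are all assumptions made for this example. A real network would use a deep learning library's layers, learn `gamma` and `beta`, and keep running statistics for inference.

```python
import math


def relu(x):
    # Rectified linear unit: passes positive values, zeroes out the rest.
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    # One common ReLU variant: a small slope for negative inputs
    # (the 0.01 default is an assumption for this sketch).
    return x if x > 0 else slope * x


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalise each feature across the batch using the batch's own
    # mean and (biased) variance, i.e. training-mode statistics.
    # gamma and beta are fixed here; in practice they are learned.
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i in range(n):
            out[i][j] = gamma * (column[i] - mean) / math.sqrt(var + eps) + beta
    return out


def block(batch, activation=relu):
    # Apply the non-linearity element-wise, then batch normalisation.
    activated = [[activation(v) for v in row] for row in batch]
    return batch_norm(activated)


# Hypothetical pre-activations: 3 examples, 2 features each.
pre_activations = [[-1.0, 2.0], [3.0, -4.0], [0.5, 1.0]]
normalised = block(pre_activations)
```

After `block` runs, each feature column of `normalised` has approximately zero mean and unit variance across the batch. Passing `activation=leaky_relu` swaps in the variant without changing the rest of the block.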