Use ELU or SELU activations;
…ever, in general the performance of the networks trained with standardized data is slightly worse. The only exception is Thresholded ReLU, whose results improved significantly.
…ved accuracy, which indicates overfitting. The widely used ReLU showed only average results. The clear winners here are ELU and SELU.
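To make the recommendation concrete, here is a minimal NumPy sketch of the two winning activations. The SELU constants are the fixed values from Klambauer et al. (2017); everything else (function names, the sample input) is illustrative, not taken from the experiments above.

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU: identity for x > 0, smooth exponential saturation toward
    # -alpha for x <= 0, which keeps mean activations closer to zero
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def selu(x):
    # SELU: a scaled ELU with fixed constants chosen so that, under
    # suitable initialization, activations self-normalize toward
    # zero mean and unit variance across layers
    alpha = 1.6732632423543772
    scale = 1.0507009873554805
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-2.0, 0.0, 2.0])
print(elu(x))
print(selu(x))
```

In a framework such as Keras, the same choice is typically made by passing `activation='elu'` or `activation='selu'` to a layer rather than implementing the functions by hand.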