A new kind of pooling layer for faster and sharper convergence
Sahil Singla

I would like to see a comparison between this and vanilla softmax pooling. Suppose a_1, a_2, a_3, a_4 are the activations being pooled. Apply a linear function x → wx, where w is a parameter to be learned, then take the softmax of the result and use it to weight the activations. This specialises to max pooling as w grows large and to average pooling as w approaches 0, but doesn't require any reordering of elements.
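To make the suggestion concrete, here is a minimal NumPy sketch of what I mean by softmax pooling. It assumes the pooled output is the softmax-weighted sum of the activations and that w is a single scalar. The function name softmax_pool and the example values are mine, purely for illustration, and are not taken from the paper.

```python
import numpy as np

def softmax_pool(a, w):
    """Pool activations `a` with softmax weights computed from w * a.

    Assumption: output = sum_i softmax(w * a)_i * a_i, with scalar w.
    """
    a = np.asarray(a, dtype=float)
    z = w * a
    z = z - z.max()          # subtract the max for numerical stability
    p = np.exp(z)
    p /= p.sum()             # softmax weights, non-negative and summing to 1
    return float(np.dot(p, a))

# Hypothetical activations in one pooling window.
acts = [0.1, 0.5, 2.0, 1.2]

print(softmax_pool(acts, 0.0))   # w = 0: uniform weights, i.e. average pooling
print(softmax_pool(acts, 50.0))  # large w: weight concentrates on the maximum
```

In a network, w would be trained by backpropagation along with the other parameters, so the layer can settle anywhere between the two regimes without sorting the activations.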

