What pooling brings is just translational invariance, which doesn’t really cover the rotational…
Gökçen Eraslan

Gökçen Eraslan: I guess the language of the article is misleading here. True, pooling only brings in translational invariance. Rotational invariance is acquired either by rotating the kernels or by feeding rotated input images to the network.
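To make the distinction concrete, here is a minimal sketch in plain Python. The helper names `max_pool_2x2` and `rotate_90` and the toy 4x4 activation map are illustrative assumptions, not anything from the article. It shows that 2x2 max pooling gives the same output when a response shifts by one pixel inside a pooling window, but a different output when the input is rotated. Note that the tolerance is only local: a shift that crosses a window boundary would change the pooled output too.

```python
def max_pool_2x2(image):
    """Non-overlapping 2x2 max pooling on a list-of-lists image with even sides."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def rotate_90(image):
    """Rotate a list-of-lists image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# Toy activation map (hypothetical): one strong response in the top-left corner.
original = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# The same response moved one pixel to the right, still inside the first 2x2 window.
shifted = [
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# The original map rotated by 90 degrees: the response lands in the top-right corner.
rotated = rotate_90(original)

pooled_original = max_pool_2x2(original)
pooled_shifted = max_pool_2x2(shifted)
pooled_rotated = max_pool_2x2(rotated)

print(pooled_original == pooled_shifted)  # small shift: pooled output unchanged
print(pooled_original == pooled_rotated)  # rotation: pooled output differs
```

In practice the rotated case is usually handled the way described above, by showing the network rotated copies of the training images, rather than by pooling.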

To answer your second question: what I meant is that pooling, together with variation in the dataset, brings all sorts of invariances into the network. Yes, it doesn't remove invariance. My point is that, as a side effect, pooling also makes the network tolerant to many other kinds of variation.

If we train a network to recognize human faces, and the network somehow learns that a face contains 2 eyes, 1 nose and 1 mouth as its feature vector, then pooling actually loses the positional information that the eyes need to be at the top and the mouth at the bottom.
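The face example can be sketched in a few lines of plain Python. This is a deliberately extreme toy, assuming a hypothetical grid of part detections and a `global_pool` helper that pools over the whole image at once; real networks pool over small windows and lose position more gradually. The two grids contain the same parts in different places, and the pooled feature vector cannot tell them apart.

```python
from collections import Counter


def global_pool(part_map):
    """Count each detected part over the whole grid, discarding where it occurred."""
    return Counter(cell for row in part_map for cell in row if cell != ".")


# Hypothetical detector output: eyes at the top, nose in the middle, mouth at the bottom.
face = [
    ["eye", ".", "eye"],
    [".", "nose", "."],
    [".", "mouth", "."],
]

# The same parts in an arrangement that is clearly not a face.
scrambled = [
    [".", "mouth", "."],
    ["eye", ".", "."],
    ["nose", ".", "eye"],
]

face_features = global_pool(face)
scrambled_features = global_pool(scrambled)

print(dict(face_features))
print(face_features == scrambled_features)  # True: the layout is gone after pooling
```

Both grids reduce to 2 eyes, 1 nose and 1 mouth, so a classifier that sees only the pooled counts has no way to require that the eyes sit above the mouth.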