"Nowhere in this setup there is pose (translational and rotational) relationship between simpler features that make up a higher level feature."

— quoted from Understanding Hinton's Capsule Networks. Part I: Intuition, by Max Pechyonkin

Isn't translation information embedded in the location within the output matrix itself?

It seems clear that you can train a weight matrix that is sensitive only to certain parts of the field of view!
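To illustrate the translation point with a toy sketch (the pattern and helper function here are my own, not from the article): a convolution's response peak moves together with the feature, so the feature map's coordinates themselves carry the translation information.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation, the operation a conv layer computes."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A small edge-like pattern, used both as the filter and as the image content.
pattern = np.array([[1., -1.], [1., -1.]])

img = np.zeros((6, 6))
img[1:3, 1:3] = pattern          # place the pattern at row 1, col 1

shifted = np.zeros((6, 6))
shifted[3:5, 2:4] = pattern      # same pattern, shifted to row 3, col 2

resp = correlate2d_valid(img, pattern)
resp_shifted = correlate2d_valid(shifted, pattern)

# The response peak sits exactly where the pattern sits, and it moves
# when the pattern moves: location in the map encodes translation.
peak = np.unravel_index(resp.argmax(), resp.shape)
peak_shifted = np.unravel_index(resp_shifted.argmax(), resp_shifted.shape)
print(peak, peak_shifted)
```

The peak lands at (1, 1) for the first image and at (3, 2) for the shifted one, matching where the pattern was placed.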

As for rotation, I believe some filters will end up more sensitive to rotated (by some degree) versions of other filters. If a higher-level feature's filter responds more strongly to those, the higher-level feature is likely a whole composed of these rotated parts.

So I don't see why a CNN cannot cope with rotational relationships. The problem is that they mostly have to be "relearned" for each orientation!
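A tiny NumPy sketch of what I mean by "relearned" (the filter names and values are my own illustration): a filter tuned to one orientation barely responds to a rotated copy of its own pattern, so a separate, rotated filter has to be learned to cover that orientation.

```python
import numpy as np

# A 2x2 vertical-edge filter and a 90-degree-rotated copy of it.
vertical_filter = np.array([[1., -1.], [1., -1.]])
rotated_filter = np.rot90(vertical_filter)   # now detects horizontal edges

# Input patch: the same edge pattern, rotated by 90 degrees.
rotated_patch = np.rot90(vertical_filter)

# One convolution position is just a dot product of filter and patch.
miss = np.sum(vertical_filter * rotated_patch)  # original filter misses it
hit = np.sum(rotated_filter * rotated_patch)    # only the rotated filter fires
print(miss, hit)
```

The original filter scores 0.0 on its own rotated pattern while the rotated copy scores 4.0 — the rotation is not handled by the original weights, only by a separately learned set.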

Unless I misunderstood you.
