Rotation Equivariant Convolutional Neural Network

Introduction

Anuj Ahuja
Intel Student Ambassadors
4 min read · Jun 4, 2019


Convolutional neural networks have gained immense popularity over the past decade, mainly because they have proven to be very powerful models of sensory data such as images and video. One primary reason is that the convolutional layers used in a deep network are translation equivariant: if an input image is shifted in any direction and fed to the network, the result is the same as feeding the original, unshifted image to the network and then shifting the resulting feature maps. In other words, feature extraction is independent of spatial position. However, this is not the case for other transformations such as rotation. This article discusses one of the major drawbacks of CNNs: their lack of rotation equivariance. The article is divided into three sections. First, an overview of the problem is presented; next, the need for a solution is discussed; and lastly, a solution is analyzed.
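To make this concrete, here is a minimal NumPy sketch (my own toy example, not from the original article): a cross-correlation with circular (wrap) padding, for which shifting the input and then convolving gives exactly the shifted feature map.

```python
import numpy as np

def conv2d_wrap(x, k):
    """Toy 2D cross-correlation with circular (wrap) padding,
    so translation equivariance holds exactly on the torus."""
    out = np.zeros_like(x, dtype=float)
    kh, kw = k.shape
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * np.roll(x, shift=(-i, -j), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))      # toy "image"
k = rng.standard_normal((3, 3))      # random filter

shift = lambda a: np.roll(a, shift=(2, 3), axis=(0, 1))

lhs = conv2d_wrap(shift(x), k)       # shift first, then convolve
rhs = shift(conv2d_wrap(x, k))       # convolve first, then shift
print(np.allclose(lhs, rhs))         # True: convolution commutes with translation
```

The circular padding is only there to make the equality exact at the borders; with zero padding the same relationship holds everywhere away from the image edges.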

Overview

For example, if the input image is rotated by some angle and passed through a deep network, the resulting feature maps differ from those obtained by passing the original image through the network and then rotating the feature maps. This implies that standard convolutional layers are not rotation equivariant. A deep CNN can be pushed toward handling rotations by making the network explicitly learn separate feature maps for rotated versions of the input image, but this compels the CNN to learn rotated copies of the same features, which increases the risk of overfitting and introduces a redundant degree of freedom.
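This failure is easy to check numerically. The sketch below (again my own toy example, not the article's code) uses a plain "valid" cross-correlation: rotating the input does not simply rotate the feature map, but rotating the filter together with the input does, and that observation is exactly what group convolutions build on.

```python
import numpy as np

def corr2d(x, k):
    """Plain 'valid' 2D cross-correlation, as in a standard conv layer."""
    kh, kw = k.shape
    H, W = x.shape
    return np.array([[np.sum(x[a:a + kh, b:b + kw] * k)
                      for b in range(W - kw + 1)]
                     for a in range(H - kh + 1)])

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))      # toy square "image"
k = rng.standard_normal((3, 3))      # generic, asymmetric filter

# Rotating the input does NOT simply rotate the feature map:
print(np.allclose(corr2d(np.rot90(x), k), np.rot90(corr2d(x, k))))            # False

# Rotating the filter together with the input restores the correspondence:
print(np.allclose(corr2d(np.rot90(x), np.rot90(k)), np.rot90(corr2d(x, k))))  # True
```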

The need for rotational equivariance

Today CNNs are the go-to solution for any task related to image processing, feature learning, etc. For many types of images, it is desirable to make feature extraction orientation independent as well. Typical examples are biomedical microscopy images or astronomical data, which do not show a prevailing global orientation. Models designed for such data should therefore perform well even when the orientation of the input image changes, so there must be some way in which this drawback of CNNs can be overcome.

Solution

Convolutional networks can be generalized to exploit larger groups of symmetries, including rotations and reflections. One way of achieving this is to use a new type of layer, the group convolution (G-convolution) layer, in place of the standard planar convolution layer.

A group convolution layer can be understood by comparing it with a planar convolution layer. In a planar convolution layer, we translate the filter across the input and compute inner products; in a group convolution layer, we additionally transform (rotate) the filter and then compute the inner products. This allows the network to learn the feature maps associated with differently rotated versions of the input image in a single pass.
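A single-filter p4 group convolution (rotations by multiples of 90°) can be sketched in a few lines of NumPy. This is a toy illustration of the idea, not the GrouPy implementation: correlate with all four rotations of one filter, producing a stack of orientation channels.

```python
import numpy as np

def corr2d(x, k):
    """Plain 'valid' 2D cross-correlation."""
    kh, kw = k.shape
    H, W = x.shape
    return np.array([[np.sum(x[a:a + kh, b:b + kw] * k)
                      for b in range(W - kw + 1)]
                     for a in range(H - kh + 1)])

def gconv_p4(x, k):
    """Toy p4 group convolution with one filter: correlate with all four
    90-degree rotations of k, giving a stack of orientation channels."""
    return np.stack([corr2d(x, np.rot90(k, r)) for r in range(4)])

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))

out = gconv_p4(x, k)                 # shape (4, 6, 6)
out_rot = gconv_p4(np.rot90(x), k)   # same layer applied to the rotated image

# Rotating the input rotates each map and cyclically permutes the
# orientation channels -- the rotation is encoded, not discarded:
expected = np.stack([np.rot90(out[(s - 1) % 4]) for s in range(4)])
print(np.allclose(out_rot, expected))   # True
```

This is the equivariance property: a rotation of the input produces a predictable transformation (rotation plus channel permutation) of the output, rather than an unrelated feature map.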

The figure above shows a G-convolution layer that is equivariant under rotation: as the figure illustrates, the filter is transformed (rotated) and the inner product is computed at each transformation. Analogous to G-convolution, there is also a G-pooling layer.
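A toy NumPy sketch of G-pooling (my own illustration, not GrouPy code): after a p4 group convolution, taking the maximum over the four orientation channels yields a single feature map that transforms like the input under rotation.

```python
import numpy as np

def corr2d(x, k):
    """Plain 'valid' 2D cross-correlation."""
    kh, kw = k.shape
    H, W = x.shape
    return np.array([[np.sum(x[a:a + kh, b:b + kw] * k)
                      for b in range(W - kw + 1)]
                     for a in range(H - kh + 1)])

def gconv_p4(x, k):
    """Toy p4 group convolution: all four 90-degree filter rotations."""
    return np.stack([corr2d(x, np.rot90(k, r)) for r in range(4)])

def gpool(stack):
    """Toy G-pooling: max over the four orientation channels."""
    return stack.max(axis=0)

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))

# After pooling over orientations, rotating the input simply rotates
# the pooled feature map, as with ordinary translation equivariance:
print(np.allclose(gpool(gconv_p4(np.rot90(x), k)),
                  np.rot90(gpool(gconv_p4(x, k)))))   # True
```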

Implementation

The G-convolution layer can be implemented with GrouPy [Cohen & Welling, 2016], a Python library that implements group equivariant convolutional neural networks in Chainer and TensorFlow and supports other numerical computations involving transformation groups.

The implementation of the CNN was run on the Intel AI DevCloud. The Intel architecture on the DevCloud allowed the network to be trained efficiently.

Example code:

Code credits: Taco Cohen and Max Welling (GitHub link)

Benefits of Group Convolution Layer

  • Can be used as a drop-in replacement for a standard convolution layer in a deep network
  • Faster learning
  • Allows the network to learn feature maps associated with differently rotated versions of the input image in a single pass
  • Negligible computational overhead
  • Improves accuracy

References

  • Maurice Weiler, Fred A. Hamprecht, and Martin Storath. Learning Steerable Filters for Rotation Equivariant CNNs. arXiv:1711.07289, 2017.
  • Bastiaan S. Veeling, Jasper Linmans, Jim Winkens, Taco Cohen, and Max Welling. Rotation Equivariant CNNs for Digital Pathology. arXiv:1806.03962, 2018.
  • Taco Cohen and Max Welling. Group Equivariant Convolutional Networks. arXiv:1602.07576, 2016.
