Receptive Field in Deep Convolutional Networks

Reza Kalantar
5 min read · Jan 2, 2023


The receptive field of a neuron in a convolutional neural network (CNN) is the region of the input image that is used to compute that neuron's output in the feature map. The receptive field is important because it determines the context, or the information, that the neuron has access to when making a prediction.

The receptive field of a neuron in a CNN is determined by the kernel size and the stride of the convolutional layers leading up to it (Fig. 1). A larger kernel directly enlarges the region a single layer sees, while a larger stride enlarges the receptive fields of all subsequent layers, because each downstream neuron then covers a wider span of the original input. Conversely, smaller kernels and smaller strides give a neuron access to less context or information from the input image. The size of the receptive field can be adjusted to suit the specific requirements of the task at hand.

Fig. 1: Receptive Field in Neural Networks, kernel size (3x3) — Image created by the author
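As a minimal sketch of how kernel size and stride interact across stacked layers (the helper name below is purely illustrative), the receptive field can be accumulated layer by layer: each layer adds (kernel_size - 1) multiplied by the product of the strides of all earlier layers.

# Receptive field of a stack of conv layers (minimal sketch).
# Each (kernel_size, stride) pair describes one layer, in order.
def stacked_receptive_field(layers):
    rf = 1      # receptive field of a single input pixel
    jump = 1    # distance (in input pixels) between adjacent outputs
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

# Two 3x3 layers with stride 1 see a 5x5 region of the input,
# but making the first layer stride 2 pushes that to 7x7.
print(stacked_receptive_field([(3, 1), (3, 1)]))  # 5
print(stacked_receptive_field([(3, 2), (3, 1)]))  # 7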

For example, consider the following image, which is a 5x5 grayscale image with pixel values ranging from 0 (black) to 255 (white):

255 255 255 255 255
255 0 0 0 255
255 0 0 0 255
255 0 0 0 255
255 255 255 255 255

Now suppose we have a CNN with a single convolutional layer that applies a 3x3 filter with stride 1. Each neuron in the resulting feature map has a 3x3 receptive field in the input image. For the first (top-left) output neuron, the receptive field is the 3x3 region in the top-left corner of the image:

255 255 255
255 0 0
255 0 0

The next output neuron along the row has a receptive field that is offset by one column to the right:

255 255 255
0 0 0
0 0 0

And the output neuron one row down has a receptive field that is offset by one row:

255 0 0
255 0 0
255 0 0

For each of these receptive fields, the layer computes the corresponding neuron's output by taking the element-wise product of the filter weights with the input region, summing the results, and applying an activation function. The receptive field of a neuron in a CNN is important because it determines what the neuron is able to learn to recognize or detect in the image.
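As a minimal sketch of this computation (the 3x3 weights below are arbitrary and purely illustrative), the top-left neuron's output can be written out by hand in PyTorch:

import torch

# The 5x5 example image from above.
image = torch.tensor([
    [255., 255., 255., 255., 255.],
    [255.,   0.,   0.,   0., 255.],
    [255.,   0.,   0.,   0., 255.],
    [255.,   0.,   0.,   0., 255.],
    [255., 255., 255., 255., 255.],
])

# A hypothetical 3x3 filter, used only for illustration.
weights = torch.tensor([
    [1., 0., -1.],
    [1., 0., -1.],
    [1., 0., -1.],
])

# Element-wise product over the top-left 3x3 receptive field,
# summed, then passed through a ReLU activation.
receptive_field = image[0:3, 0:3]
output = torch.relu((receptive_field * weights).sum())
print(output.item())  # 510.0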

Dilation is another technique for increasing the receptive field of a unit, or group of units, without increasing the number of parameters or the amount of computation. It is often used in CNNs to give the network a larger context when processing the input. Receptive field dilation is achieved by inserting spaces (zeros) between the elements of the filters used in the CNN. These spaces cause the filters to “dilate”, or expand, resulting in a larger receptive field.

For example, consider a convolutional layer with filters of size 3x3 and stride 1. The receptive field of each unit in this layer is 3x3. Now suppose we want to increase the size of the receptive field without changing the size of the filters. We can do this by inserting one zero between the elements of each filter, i.e. using a dilation rate of 2. The dilated filter then spans a 5x5 region, so the receptive field becomes 5x5, with the filter “skipping over” every other element of the input (Fig. 2).

Fig. 2: Dilation convolution — Image created by the author
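To make the zero-insertion concrete, here is a small sketch (the helper name is illustrative, not a library function) that expands a 3x3 kernel into its dilated 5x5 form:

import torch

def dilate_kernel(kernel, dilation):
    # Insert (dilation - 1) zeros between the elements of a 2D kernel.
    k = kernel.shape[0]
    size = k + (k - 1) * (dilation - 1)
    dilated = torch.zeros(size, size)
    dilated[::dilation, ::dilation] = kernel
    return dilated

kernel = torch.ones(3, 3)
print(dilate_kernel(kernel, dilation=2))
# 5x5 kernel: the original weights now sit on every other row and column.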

In PyTorch, you can read the kernel size and dilation rate off an nn.Conv2d layer and use them to calculate the receptive field of a single convolutional layer. In this tutorial, we will calculate the receptive field of a convolutional layer in PyTorch for different kernel sizes and dilation rates.

import torch
import torch.nn as nn

def get_receptive_field(kernel_size, dilation=1):
    # Receptive field of a single conv layer: kernel_size taps,
    # spaced `dilation` pixels apart.
    return kernel_size + (kernel_size - 1) * (dilation - 1)

in_channels, out_channels = 3, 64

# Calculate the receptive field with kernel size 3 and dilation 1
conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, dilation=1)
receptive_field = get_receptive_field(conv.kernel_size[0], conv.dilation[0])
print(f'Receptive field with kernel size 3 and dilation 1: {receptive_field}')

# Calculate the receptive field with kernel size 3 and dilation 2
conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, dilation=2)
receptive_field = get_receptive_field(conv.kernel_size[0], conv.dilation[0])
print(f'Receptive field with kernel size 3 and dilation 2: {receptive_field}')

# Calculate the receptive field with kernel size 5 and dilation 1
conv = nn.Conv2d(in_channels, out_channels, kernel_size=5, dilation=1)
receptive_field = get_receptive_field(conv.kernel_size[0], conv.dilation[0])
print(f'Receptive field with kernel size 5 and dilation 1: {receptive_field}')

# Calculate the receptive field with kernel size 5 and dilation 2
conv = nn.Conv2d(in_channels, out_channels, kernel_size=5, dilation=2)
receptive_field = get_receptive_field(conv.kernel_size[0], conv.dilation[0])
print(f'Receptive field with kernel size 5 and dilation 2: {receptive_field}')

Here is the output:

Receptive field with kernel size 3 and dilation 1: 3
Receptive field with kernel size 3 and dilation 2: 5
Receptive field with kernel size 5 and dilation 1: 5
Receptive field with kernel size 5 and dilation 2: 9
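Note that, for a single layer, the stride on its own does not enlarge the receptive field: each output still depends on only a kernel-sized patch of the input. Stride instead enlarges the receptive fields of the layers stacked on top of it, as in the multi-layer sketch earlier in the post.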

Dilated convolutions can also be implemented directly in PyTorch:

# Define a convolutional layer with dilation
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, dilation=2)

# Apply the convolutional layer to an input tensor
tensor = torch.randn(1, 3, 256, 256)
output = conv(tensor)

print(output.shape)  # torch.Size([1, 64, 252, 252])
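The spatial size shrinks from 256 to 252 because no padding is used: with PyTorch's defaults, the output size is floor((H + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1), which here is (256 + 0 - 4 - 1) / 1 + 1 = 252. Passing padding=2 (or padding='same' for stride 1) would preserve the 256x256 resolution.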

If you find this tutorial helpful or would like to reach out, feel free to get in touch with me here, on GitHub or LinkedIn. Happy coding!

