An Illustrated Explanation of Performing 2D Convolutions Using Matrix Multiplications

Apr 6, 2019

Introduction

In this article, I will explain how 2D Convolutions are implemented as matrix multiplications. This explanation is based on the notes of the CS231n Convolutional Neural Networks for Visual Recognition (Module 2). I assume the reader is familiar with the concept of a convolution operation in the context of a deep neural network. If not, this repo has a report and excellent animations explaining what convolutions are. The code to reproduce the computations in this article can be downloaded here.

Explanation

Small Example

Suppose we have a single channel 4 x 4 image, X, and its pixel values are as follows:
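(The pixel values appear in the article's figure. As a sketch, we can stand in for them with an arbitrary 4 x 4 single-channel image; the specific values below are placeholders, not the figure's.)

```python
import numpy as np

# A hypothetical 4 x 4 single-channel image X.
# The actual pixel values are shown in the figure above;
# these are placeholder values for illustration only.
X = np.array([
    [1,  2,  3,  4],
    [5,  6,  7,  8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
])
print(X.shape)  # (4, 4)
```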

Further suppose we define a 2D convolution with the following properties:

Properties of the 2D convolution operation we want to perform on our image
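(The figure lists the convolution's exact properties. Assuming a 2 x 2 kernel W, stride 1, and no padding, which is consistent with the nine patches described next, the output size follows the usual formula:)

```python
import numpy as np

# Assumed convolution parameters (consistent with the nine 2 x 2
# patches described in the text; the figure gives the exact values).
input_size = 4   # X is 4 x 4
kernel_size = 2
stride = 1
padding = 0

# Standard output-size formula: (N - F + 2P) / S + 1
out_size = (input_size - kernel_size + 2 * padding) // stride + 1
print(out_size)             # 3 -> a 3 x 3 output
print(out_size * out_size)  # 9 patches in total

# A hypothetical 2 x 2 kernel W (placeholder values).
W = np.array([[1, 2],
              [3, 4]])
```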

This means there will be nine 2 x 2 image patches, each of which will be element-wise multiplied with the matrix W, like so:

All the possible 2 x 2 image patches in X given the parameters of the 2D convolution. Each color represents a unique patch

These image patches can be represented as 4-dimensional column vectors and concatenated to form a single 4 x 9 matrix, P, like so:
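This patch-to-column step (commonly called im2col) can be sketched as follows. The pixel values are placeholders, but the shapes match the text: each 2 x 2 patch flattens to a 4-dimensional vector, and the nine columns form the 4 x 9 matrix P.

```python
import numpy as np

X = np.arange(1, 17).reshape(4, 4)  # placeholder 4 x 4 image

# Extract every 2 x 2 patch (stride 1) and flatten each into a column.
patches = []
for i in range(3):          # 3 vertical patch positions
    for j in range(3):      # 3 horizontal patch positions
        patch = X[i:i + 2, j:j + 2]
        patches.append(patch.reshape(-1))  # 4-dimensional vector

P = np.stack(patches, axis=1)  # each column is one flattened patch
print(P.shape)  # (4, 9)
```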
