Convolutional neural networks

Steven
3 min readDec 22, 2021

--

First, in order to describe a convolutional neural network, we must understand what a neural network is. We will use a common example for neural networks, which is number recognition.

How is it that we are able to recognize a series of lines as numbers? How does our brain match these with 4?

Four 4s

What if you wanted to program something that is able to recognize a number in a 28*28 image? Something like this can become amazingly complex.

Compared to classical machine learning, deep learning uses more layers to do feature extraction as well as classification, whereas classical ML only does classification.

What is a neural network?

First, a neural network is made of neurons (which are able to hold a number) that are linked to one another. For a 28*28 grayscale image, the neurons values could be the color of each pixel (0 for black and 1 for white). We’d have 784 pixels → 784 neurons in the first layer of our network.

The last layer has 10 neurons, each one for a number. The neuron from the last layer with the biggest value is the one that’s number is recognized.

Hidden layers are the neuron layers in the middle, here 16 neurons each

Each link between neurons have weights, and we calculate the next neuron’s value by adding the weight*value of the neurons in the layer before.

This sum is subjected to a value called “bias”, which, if the sum is superior to, makes the neuron meaningfully active.

In order to learn, the neural network’s weights and biases are tweaked, so the result’s accuracy improves.

What is a convolutional neural network?

Well, we’ve already talked about convolutional layers : they are the hidden layers in the neural network. They are able to detect patterns, and are adapted for image recognition problems. However, instead of each neuron being connected to every neuron in the previous layers, neurons in CNN are only connected to the ones that are close to it. Also, all of them have the same weight.

The layers are initialized with random numbers, and filters are applied to the image in order to detect features inside it : the image is first simplified in order to be easier to process and understand.

The CNN architecture is specific. There is a convolution layer, a pooling layer, a “RELU” layer, as well as a fully connected layer.

The convolution layer applies a filter on the image which can bring out specific features. The pooling layer down-samples the feature map and makes it faster to process. In order to perform classification, the fully-connected layer is added next.

What’s the difference between a CNN and a recurrent neural network (RNN)?

A CNN is a feed-forward network that filters special data, whereas an RNN feeds data back to itself : CNNs can detect patterns in space and RNNs in time.

We will learn about RNNs in the next post.

--

--