CNN: Convolutional Neural Network

Nikhil Agarwal
2 min read · May 27, 2020

Convolutional Neural Networks (CNNs) are mainly used for computer vision. Given an image, the model predicts an output by analyzing the features present in that image.

So how does the machine do this?

Architecture of CNN

Src: https://towardsdatascience.com/basics-of-the-classic-cnn-a3dce1225add

The image above shows the typical architecture of a CNN.

To the machine, the input image is initially just an array of numbers: the pixel values.
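To make this concrete, here is a minimal sketch of how a tiny grayscale image could look to the machine. The pixel values are made up for illustration, and the 0 to 255 brightness range is an assumption (it is a common convention, not something fixed by CNNs).

```python
# A tiny 4x4 grayscale "image": each number is one pixel's brightness.
# Assumed convention: 0 = black, 255 = white. Values are invented for illustration.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

height = len(image)      # number of pixel rows
width = len(image[0])    # number of pixel columns

print(height, width)     # 4 4
print(image[0][2])       # 255 (a bright pixel in the first row)
```

A color image would typically carry three such grids, one per color channel.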

Convolution Layer

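In the convolution layer, a filter slides over the input. At each position, the filter is multiplied element-wise with the patch of pixels beneath it and the products are summed (in effect, the input is being compared with the filter). The filter is also known as a kernel. Each filter represents a specific feature of the image.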

For example, if you are checking whether there is a dog in the image, the features can be eyes, ears, legs, etc.

This process finds the features in the image: both whether they exist and, if they do, where they are located.
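The sliding comparison can be sketched in plain Python. This is a minimal illustration, assuming no padding and a stride of 1. The helper name `convolve2d`, the toy image, and the hand-written edge filter are all hypothetical (in a real CNN the filter values are learned, not written by hand).

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Strictly speaking this is cross-correlation, which is what deep learning
    libraries usually implement under the name "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the filter with the patch under it and sum the products.
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written filter that responds where dark (left) meets bright (right).
edge_filter = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, edge_filter)
for row in feature_map:
    print(row)
# [0, 2, 0]
# [0, 2, 0]
# [0, 2, 0]
```

The high values in the middle column say both that the edge exists and where it is, which is exactly the "whether and where" information described above.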

The resulting values are then passed through the ReLU activation function, which replaces negative values with zero and leaves positive values unchanged.
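ReLU itself is tiny: it keeps positive values and turns negative ones into zero. A minimal sketch, where the helper name `relu` and the sample feature map are made up for illustration:

```python
def relu(feature_map):
    """Apply ReLU to every value: negatives become 0, the rest pass through."""
    return [[max(0, value) for value in row] for row in feature_map]


feature_map = [
    [-3, 2, 0],
    [5, -1, 4],
]

activated = relu(feature_map)
print(activated)  # [[0, 2, 0], [5, 0, 4]]
```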

Pooling

Once the features are captured, the image is simplified so as to focus only on the captured features. This aggregation of features is known as pooling.
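One common way to do this aggregation is max pooling, which keeps only the strongest value in each small window. The sketch below assumes max pooling with non-overlapping 2x2 windows; the article does not say which pooling variant is meant, and the helper name `max_pool2d` and the numbers are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping `size` x `size` window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 0, 3, 7],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4x4 feature map shrinks to 2x2, but the strongest responses survive.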

The convolution and pooling steps are typically repeated several times, which lets the network build up more complex features and usually improves accuracy.

Flatten

Once the convolution and pooling are done, the resulting feature maps are flattened into a one-dimensional vector, which serves as the input layer of a deep neural network. Here all the features are individually present.
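Flattening can be sketched as unrolling the pooled feature maps into a single flat list of numbers. The helper name `flatten` and the two small feature maps are invented for illustration.

```python
def flatten(feature_maps):
    """Unroll a list of 2D feature maps into one flat list of values."""
    return [value for fmap in feature_maps for row in fmap for value in row]


# Two pooled 2x2 feature maps (for example, the outputs of two different filters).
feature_maps = [
    [[6, 2], [2, 9]],
    [[0, 1], [3, 4]],
]

vector = flatten(feature_maps)
print(vector)       # [6, 2, 2, 9, 0, 1, 3, 4]
print(len(vector))  # 8
```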

Fully Connected

The fully connected layer is the hidden layer of the neural network. Here, every feature is connected to every neuron, so the features are considered together to recognize what the image contains.
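A fully connected layer can be sketched as a set of neurons that each take a weighted sum of all the input features and add a bias. The helper name `fully_connected` and all numbers below are made up for illustration; in a trained network the weights and biases are learned.

```python
def fully_connected(inputs, weights, biases):
    """Each neuron computes a weighted sum over ALL inputs, plus its bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


features = [1.0, 0.0, 2.0]   # flattened features

weights = [
    [0.5, -1.0, 0.25],       # weights of neuron 0 (one per input feature)
    [-0.5, 1.0, 0.0],        # weights of neuron 1
]
biases = [0.0, 1.0]

scores = fully_connected(features, weights, biases)
print(scores)  # [1.0, 0.5]
```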

Softmax

Softmax is the final layer. It outputs a probability for each possible class (for example, a 96% chance that the image is a dog).
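Softmax turns raw scores into probabilities that add up to 1. Here is a minimal sketch; the helper name `softmax`, the class labels, and the scores are hypothetical, and subtracting the maximum score is just a common trick to keep the exponentials numerically stable.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the max does not change the result but avoids overflow.
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


labels = ["dog", "cat", "bird"]   # made-up classes
scores = [4.0, 0.5, 0.2]          # made-up outputs of the last fully connected layer

probabilities = softmax(scores)
best = labels[probabilities.index(max(probabilities))]

for label, p in zip(labels, probabilities):
    print(f"{label}: {p:.1%}")
print("prediction:", best)        # prediction: dog
```

The class with the highest score always gets the highest probability, so the prediction is simply the largest entry.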

We have talked about filters. But how does the model know which filters to use?

Well, it learns them by itself from the large amount of training data that we provide. From the training data, the model picks up specific patterns that are common to many images and builds filters for those patterns on its own.
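To give a feel for what "learning a filter" can mean, here is a toy sketch. It assumes plain gradient descent on a squared error, which is one common way to train such models (the article does not name a training method). The image, the "unknown" `true_filter`, the learning rate, and the step count are all invented for illustration. A real CNN learns many filters from many labeled images rather than from a single target feature map.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]


image = [
    [1, 0, 2, 1],
    [0, 3, 1, 0],
    [2, 1, 0, 1],
    [1, 0, 1, 2],
]

# Pretend this filter is unknown; we only see the feature map it produces.
true_filter = [[-1, 1], [-1, 1]]
target = convolve2d(image, true_filter)

kernel = [[0.0, 0.0], [0.0, 0.0]]   # start with a blank filter
learning_rate = 0.02

for step in range(1000):
    output = convolve2d(image, kernel)
    for a in range(2):
        for b in range(2):
            # Gradient of 0.5 * (sum of squared errors) w.r.t. this filter weight.
            gradient = sum(
                (output[i][j] - target[i][j]) * image[i + a][j + b]
                for i in range(len(output))
                for j in range(len(output[0]))
            )
            kernel[a][b] -= learning_rate * gradient

learned = [[round(weight, 3) for weight in row] for row in kernel]
print(learned)  # ends up very close to true_filter
```

Nobody tells the model the filter values here. It starts from zeros and nudges each weight in the direction that makes its output match the data better.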
