AISaturdayLagos: Basic Overview of CNN

Tejumade Afonja
Published in AI Saturdays
Feb 13, 2018

Nothing in the world can take the place of persistence. Talent will not; nothing is more common than unsuccessful men with talent. Genius will not; unrewarded genius is almost a proverb. Education will not; the world is full of educated derelicts. Persistence and determination alone are omnipotent. The slogan Press On! has solved and always will solve the problems of the human race.

- Calvin Coolidge

We held AI Saturdays Lagos Week 6 on 10th Feb, 2018. Look at that, it’s Week 6 already :-)

In case you missed it, here’s a quick recap:

We started, as usual, with the Fast.ai lesson; this week’s topic was Collaborative Filtering. We watched about 20 minutes of the lesson before moving to this Kaggle notebook for some hands-on practice. We tried a different method this week, and everyone preferred the Kaggle hands-on approach.

Lunch — mingle — learn

Then we moved to everyone’s favorite session — Stanford CS231n where we learnt about Training Neural Networks (part 1). Serena Yeung discussed different activation functions, the importance of data preprocessing and weight initialization, and batch normalization; she also covered some strategies for monitoring the learning process and choosing hyperparameters.

And finally, we managed to read the abstract and conclusion of one of this week’s papers, which is on Convolutional Kernel Networks. We’re still gradually cultivating the habit of reading papers.

Project — discussion — project-team — bonding

A special feature for this week is Udeme Udofia’s article, a basic overview of Convolutional Neural Networks (CNN).

Enjoy!

Basic Overview of Convolutional Neural Network (CNN)

The Principle of the Convolutional Layer, Activation Function, Pooling Layer and Fully-connected Layer

A Convolutional Neural Network is a class of deep neural network used for computer vision, that is, analyzing visual imagery.


Convolutional Layer

Computers read images as pixels, expressed as a matrix (NxNx3), that is, height by width by depth. Images make use of three channels (RGB), which is why we have a depth of 3.

The Convolutional Layer makes use of a set of learnable filters. A filter is used to detect the presence of specific features or patterns in the original image (the input). It is usually expressed as a matrix (MxMx3), with smaller spatial dimensions but the same depth as the input.

This filter is convolved (slid) across the width and height of the input, and at each position a dot product is computed; the result is an activation map.

Different filters, which detect different features, are convolved over the input, and the resulting set of activation maps is passed to the next layer of the CNN.
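
To make this concrete, here is a minimal NumPy sketch of a single filter being slid over a single-channel input. The array sizes and values are made up for illustration, and real deep learning libraries implement this operation (strictly, a cross-correlation) far more efficiently:

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image`, computing a dot product at each position."""
    n, f = image.shape[0], kernel.shape[0]           # assume square image and filter
    out_dim = (n - f) // stride + 1                  # output size with no padding
    activation_map = np.zeros((out_dim, out_dim))
    for i in range(out_dim):
        for j in range(out_dim):
            patch = image[i * stride:i * stride + f, j * stride:j * stride + f]
            activation_map[i, j] = np.sum(patch * kernel)   # dot product of patch and filter
    return activation_map

image = np.random.rand(7, 7)     # a hypothetical 7x7 single-channel image
kernel = np.random.rand(3, 3)    # a hypothetical 3x3 filter
print(convolve2d(image, kernel).shape)   # (5, 5)
```

With a 7x7 input, a 3x3 filter and stride 1, the activation map comes out 5x5, which matches the formula below.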


There is a formula used to determine the dimension of the activation map:

(N + 2P - F)/S + 1, where:

  • N = Dimension of the image (input)
  • P = Padding
  • F = Dimension of the filter
  • S = Stride
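
For example (numbers chosen purely for illustration): a 32x32 input convolved with a 5x5 filter at stride 1 and no padding gives (32 + 2(0) - 5)/1 + 1 = 28, i.e. a 28x28 activation map; with padding P = 2, the output stays 32x32.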

Activation Function

An activation function is a node placed at the end of, or in between, the layers of a neural network. It helps decide whether a neuron should fire or not.

“The activation function is the non linear transformation that we do over the input signal. This transformed output is then sent to the next layer of neurons as input.” — Analytics Vidhya

[Figure: common activation functions]

We have different types of activation functions, as in the figure above, but for this post my focus will be on the Rectified Linear Unit (ReLU).

The ReLU function is the most widely used activation function in neural networks today. One of the greatest advantages ReLU has over other activation functions is that it does not activate all neurons at the same time. From the graph of the ReLU function above, we notice that it converts all negative inputs to zero, so those neurons do not get activated. This makes it very computationally efficient, as only a few neurons are activated at a time. It does not saturate in the positive region. In practice, ReLU converges about six times faster than the tanh and sigmoid activation functions.

One disadvantage ReLU presents is that it saturates in the negative region, meaning that the gradient there is zero. With the gradient equal to zero, the corresponding weights are not updated during backpropagation; to fix this, we use Leaky ReLU. Also, ReLU functions are not zero-centered, which means the optimization may have to follow a zig-zag path to reach the optimal point, and that can take longer.
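
As a quick illustration, here is a minimal NumPy sketch of ReLU and Leaky ReLU (the 0.01 slope for the negative region is a common but arbitrary choice):

```python
import numpy as np

def relu(x):
    """Negative inputs become 0; positive inputs pass through unchanged."""
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but negative inputs keep a small slope so their gradient is not zero."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))        # [0.  0.  0.  1.  3.]
print(leaky_relu(x))  # [-0.02  -0.005  0.     1.     3.   ]
```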

Pooling Layer

The Pooling layer sits between Convolution layers in a CNN architecture. This layer reduces the number of parameters and the amount of computation in the network, and helps control overfitting, by progressively reducing the spatial size of the representation.

There are two common operations in this layer: Average pooling and Maximum pooling. Only Max-pooling will be discussed in this post.

Max-pooling, as the name states, keeps only the maximum value from each pool. This is done with a filter sliding over the input; at every stride, the maximum value is taken and the rest are dropped. This down-samples the input (a small sketch follows the formula below).

Unlike the convolution layer, the pooling layer does not alter the depth of the network; the depth dimension remains unchanged.


Formula for the output dimension after Max-pooling:

(N - F)/S + 1, where:

  • N = Dimension of the input to the pooling layer
  • F = Dimension of the filter
  • S = Stride
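
Here is a minimal NumPy sketch of Max-pooling with a 2x2 filter and stride 2 (the input values are made up for illustration); note that the output dimension matches (N - F)/S + 1:

```python
import numpy as np

def max_pool2d(x, f=2, stride=2):
    """Keep only the maximum value in each f x f window of a 2-D input."""
    n = x.shape[0]
    out_dim = (n - f) // stride + 1          # (N - F)/S + 1
    out = np.zeros((out_dim, out_dim))
    for i in range(out_dim):
        for j in range(out_dim):
            window = x[i * stride:i * stride + f, j * stride:j * stride + f]
            out[i, j] = window.max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)   # a 4x4 activation map
print(max_pool2d(x))
# [[ 5.  7.]
#  [13. 15.]]
```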

Fully-connected Layer

In this layer, the neurons have full connections to all the activations from the previous layer. Their activations can hence be computed with a matrix multiplication followed by a bias offset. This is the last phase of a CNN.

The Convolutional Neural Network is thus made up of hidden layers (the convolutional, activation and pooling layers) and fully-connected layer(s).
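
For concreteness, here is a minimal NumPy sketch of a fully-connected layer as a matrix multiplication followed by a bias offset (the layer sizes, 128 inputs and 10 outputs, are arbitrary examples):

```python
import numpy as np

def fully_connected(x, W, b):
    """Every output neuron is connected to every input activation."""
    return x @ W + b                    # matrix multiplication plus bias offset

x = np.random.rand(1, 128)              # flattened activations from the previous layer
W = np.random.rand(128, 10)             # learnable weights: 128 inputs -> 10 outputs
b = np.zeros(10)                        # learnable bias
print(fully_connected(x, W, b).shape)   # (1, 10), e.g. a score per class
```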


Thank you Udeme Udofia, for a beautifully written article.

AISaturdayLagos wouldn’t have happened without my fellow ambassador Azeez Oluwafemi, our Partners FB Dev Circle Lagos, Vesper.ng and Intel.

A big Thanks to Nurture.AI for this amazing opportunity.

Also read how AI Saturdays is Bringing the World Together with AI

See you next week 😎.

View our pictures here and follow us on twitter :)
