Flattening CNN layers for Neural Network and basic concepts

Muhammad Shoaib Ali
3 min read · Jun 23, 2022


Why Deep learning?

In the real world, the amount of data is constantly increasing. As data grows, traditional machine learning algorithms reach a peak accuracy and then stop improving beyond a certain point. This is where deep learning outshines them: its accuracy generally keeps improving as more data becomes available. Traditional ML also tends to perform poorly on high-dimensional data.

Let's first see how the layers of a model work.

  • Convolutional layer

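Image filtering is the process of modifying an image by changing the shade or colour of its pixels using a small matrix called a kernel (or filter). A convolutional layer slides the kernel over the image and, at each position, computes a weighted sum of the pixels under it. Filtering is also used to adjust brightness and contrast.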

A 3x3 kernel in a convolutional layer with one channel
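The sliding-kernel idea can be sketched in a few lines of NumPy. This is a minimal illustration, not how a deep learning library implements it: the function name `conv2d_single_channel`, the 5x5 input, and the averaging kernel are all hypothetical, and the sketch assumes one channel, stride 1, and no padding. Like most deep learning libraries, it does not flip the kernel.

```python
import numpy as np

def conv2d_single_channel(image, kernel):
    """Filter a 2-D image with a 2-D kernel (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum the result.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Hypothetical 5x5 grayscale image (values 0..24) and a 3x3 averaging kernel.
image = np.arange(25, dtype=float).reshape(5, 5)
blur_kernel = np.ones((3, 3)) / 9.0

feature_map = conv2d_single_channel(image, blur_kernel)
print(feature_map.shape)  # (3, 3)
```

A 5x5 input filtered with a 3x3 kernel shrinks to 3x3, which is the size loss that padding is meant to counter.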

  • Pooling layer

A pooling layer reduces the dimensions of the feature map. This lowers the number of parameters to learn and the amount of computation performed in the network. The pooling layer summarises the features present in each region of the feature map generated by a convolutional layer.

Max and Average pooling layers

In this image the kernel size is 2x2 and the stride is 2, which means the kernel moves two pixels at each step.

The max pooling layer takes the maximum value in each 2x2 window of the input (for example, the max of the light blue area [8, 7, 12, 9] is 12).

The average pooling layer takes the average of each 2x2 window (for example, in the blue area, (8 + 7 + 12 + 9) / 4 = 9).
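Both pooling variants can be sketched with one small NumPy function. The name `pool2d` and the 4x4 feature map are hypothetical; only the top-left window [8, 7, 12, 9] comes from the example above, and the remaining values are made up for illustration.

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Max or average pooling over a 2-D feature map (no padding)."""
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    reduce_fn = np.max if mode == "max" else np.mean
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Select the window the kernel currently covers.
            r, c = i * stride, j * stride
            out[i, j] = reduce_fn(feature_map[r:r + size, c:c + size])
    return out

# Hypothetical 4x4 feature map; the top-left 2x2 window holds 8, 7, 12, 9.
fmap = np.array([[8, 7, 1, 2],
                 [12, 9, 3, 4],
                 [5, 6, 0, 1],
                 [7, 8, 2, 3]], dtype=float)

max_pooled = pool2d(fmap, mode="max")
avg_pooled = pool2d(fmap, mode="avg")
print(max_pooled[0, 0], avg_pooled[0, 0])  # 12.0 9.0
```

With a 2x2 window and stride 2 the windows do not overlap, so the 4x4 map is halved to 2x2 in each dimension.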

  • Padding layer
Padding affects the output image size during filtering

Padding is closely tied to the convolutional layer: it refers to the number of pixels added around an image when it is processed by a kernel or filter. If no stride is specified, it defaults to 1. Half padding means padding by half the filter size (rounded down), and full padding means padding by the filter size minus one (K − 1), so that every partial overlap between kernel and image is used. Padding is done to reduce the loss of information at the sides/boundaries of the image.
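As a rough illustration of half padding, the snippet below zero-pads a small image with NumPy's `np.pad`. The 3x3 image and the variable names are hypothetical, and zero padding is an assumption here, since the padding value is not specified above.

```python
import numpy as np

# Hypothetical 3x3 image with values 1..9.
image = np.arange(1, 10).reshape(3, 3)

k = 3              # kernel size
half_pad = k // 2  # half padding: 1 pixel on every side for a 3x3 kernel

# Add a border of zeros around the image.
padded = np.pad(image, pad_width=half_pad, mode="constant", constant_values=0)
print(padded.shape)  # (5, 5)
```

Filtering this 5x5 padded image with a 3x3 kernel at stride 1 gives a 3x3 output, the same size as the original image.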

The output size of the image is calculated with the formula [(W − K + 2P) / S] + 1, where the square brackets mean rounding down and:

  • W is the input size (width or height)
  • K is the Kernel size
  • P is the padding
  • S is the stride
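The formula translates directly into a small helper. The function name `conv_output_size` is hypothetical, and integer division is used on the assumption that the square brackets in the formula mean rounding down.

```python
def conv_output_size(w, k, p=0, s=1):
    """Output size for input size w, kernel size k, padding p, stride s."""
    return (w - k + 2 * p) // s + 1

print(conv_output_size(28, 5))            # 24: 5x5 kernel, no padding, stride 1
print(conv_output_size(24, 2, p=0, s=2))  # 12: 2x2 window, stride 2
print(conv_output_size(28, 3, p=1))       # 28: half padding keeps the size
```

The same helper works for pooling layers by passing the pool size as `k`.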

  • Flatten operation

The idea behind the flattening layer is to convert the data into a 1-dimensional array to feed the next layer. We flatten the output of the convolutional layers into a single long feature vector, which is connected to the final classification model, called the fully connected layer. For example, a pooled feature map of shape [5, 5, 5] is flattened into a single 1x125 vector (5 × 5 × 5 = 125). So the flatten layer converts a multidimensional array into a one-dimensional vector.
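The [5, 5, 5] example can be reproduced with a NumPy reshape. The array contents are placeholder values chosen only so that the shapes can be checked.

```python
import numpy as np

# Placeholder pooled feature map of shape (5, 5, 5).
pooled = np.arange(125).reshape(5, 5, 5)

# Flatten into one row; -1 lets NumPy infer the length (5 * 5 * 5 = 125).
flat = pooled.reshape(1, -1)
print(flat.shape)  # (1, 125)
```

No values are changed or dropped by flattening; only the shape is different.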

  • Connected Components

The model takes an input image of size 28x28 and applies the first conv layer with a 5x5 kernel, stride 1 and zero padding. This outputs n1 channels of size 24x24, calculated as (Input Size − Kernel Size + 2 × Padding) / Stride + 1 = (28 − 5 + 0) / 1 + 1 = 24. Next comes a pooling layer, which works like the conv layer but with a 2x2 filter and stride 2; applying the same formula with the pool size in place of the kernel size gives (24 − 2 + 0) / 2 + 1 = 12, so the outputs are 12x12 with the same n1 channels. The same conv and pooling steps are repeated once more, and at the end the flatten layer converts the resulting two-dimensional arrays into a one-dimensional vector.
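The walk-through can be checked by tracing the spatial size layer by layer. This sketch assumes the second conv and pooling layers use the same settings as the first (5x5 kernel with stride 1, then 2x2 pooling with stride 2), and the channel count `n2 = 16` is a hypothetical value, since the text gives no number for it.

```python
def output_size(w, k, p=0, s=1):
    """Output size for input size w, kernel size k, padding p, stride s."""
    return (w - k + 2 * p) // s + 1

# (name, kernel size, padding, stride); the second pair is assumed to repeat the first.
layers = [
    ("conv1", 5, 0, 1),
    ("pool1", 2, 0, 2),
    ("conv2", 5, 0, 1),
    ("pool2", 2, 0, 2),
]

size = 28  # input image is 28x28
sizes = [size]
for name, k, p, s in layers:
    size = output_size(size, k, p, s)
    sizes.append(size)
    print(f"{name}: {size}x{size}")

n2 = 16  # hypothetical number of channels after the second conv layer
flat_length = n2 * size * size
print(flat_length)  # 256
```

Under these assumptions the sizes go 28, 24, 12, 8, 4, and the flatten layer hands a vector of n2 × 4 × 4 values to the fully connected layer.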

Conclusion:

We went through the basic layers and components needed for working with a CNN, and followed an input image through them on its way to classification.
