Why are Convolutional Neural Networks better than Feed Forward Networks?

Introduction to AlexNet & ZFNet

Akshat
Analytics Vidhya
4 min read · Oct 18, 2019


Deep Learning is such an interesting topic in today’s world. It has the power to predict things with its slightly complex but interesting algorithms.

While I am still studying this topic, I was really fascinated by Convolutional Neural Networks. After my dive into Feed Forward Neural Networks, I had this question in my mind: why move on to CNN’s if Feed Forward Networks are so interesting and so capable of performing classification? I wasn’t really planning to study CNN’s, but I still took my pen & paper and started researching CNN’s over the internet. All I wanted to know was why they are so popular and how they are better than FFN’s. So I started reading articles online, and my interest only grew. After a span of a few weeks, I can tell you why they are such an interesting topic.

CNN on the left & FFN on the right

If we have an image as an input, each & every pixel has three values (RGB values) associated with it, and in a feed forward network each of those values needs its own weight. So we can apply feed forward networks to images, but if a standard image, say of size 227*227, is the input, then the number of input values becomes 227*227*3, which is roughly 1.5 * 10⁵. Every single neuron in the first layer would need that many weights, so even a single layer with a modest number of neurons gives millions of parameters, which is really impractical and bothersome. Hence, feed forward networks are ill-suited for handling images. In CNN’s, a kernel is built (a kernel is basically a small matrix of weights) and the weights are shared as the kernel moves horizontally and vertically across the image, so the parameter count no longer depends on the image size. The maxpooling operation has no weights of its own and shrinks the feature maps (a 2*2 pool with stride 2 halves each spatial dimension), and the concepts of padding and stride further control how much the feature maps, and the computation that follows them, shrink.
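To make that concrete, here is a minimal PyTorch sketch comparing one fully connected layer on a flattened 227*227 RGB image with AlexNet’s first convolution layer. The hidden width of 1000 neurons is just an illustrative assumption, not a fixed rule:

```python
import torch.nn as nn

# One fully connected layer on a flattened 227x227 RGB image
# (the hidden width of 1000 is an arbitrary, illustrative choice)
ffn = nn.Linear(227 * 227 * 3, 1000)
ffn_params = sum(p.numel() for p in ffn.parameters())

# AlexNet's first convolution layer: 96 kernels of size 11x11, stride 4
conv = nn.Conv2d(in_channels=3, out_channels=96, kernel_size=11, stride=4)
conv_params = sum(p.numel() for p in conv.parameters())

print(f"Fully connected layer: {ffn_params:,} parameters")  # 154,588,000
print(f"Convolution layer:     {conv_params:,} parameters")  # 34,944
```

Weight sharing is what makes the difference: the convolution layer’s cost depends only on the kernel size and the number of kernels, not on the size of the image.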

The CNN’s 3-d connections look really complex, but it’s an easy topic, especially because we use frameworks like PyTorch :). Also, in networks like AlexNet, the kernel size decreases (or stays the same) as we proceed to the next convolution layers, and the feature maps keep shrinking, unlike in Feed Forward Networks.

Let’s discuss two old but interesting pre-built CNN’s:

AlexNet

AlexNet Structure

The structure gives rise to roughly 60 million parameters, the vast majority of which come from the last three fully connected layers alone; the rest come from the convolution layers behind them. The kernel size decreases or remains the same as we proceed. There are 96 kernels in the first convolution layer, each of size 11*11 (spanning all 3 colour channels), and each of them moves horizontally & vertically across the input. The 11*11 kernel size means that we are capturing a large area of pixels of the image at once. As for why these exact hyperparameters, there is no proper method to determine them: Deep Learning is a trial-and-error process where we check accuracy and keep the settings for which accuracy is high.
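You can check where the parameters live with torchvision’s built-in AlexNet. Note this is a single-stream variant that is close to, but not identical to, the original two-GPU network, so the exact counts differ slightly from the paper:

```python
import torchvision.models as models

# torchvision's AlexNet: a single-stream variant of the original architecture
net = models.alexnet()

conv_params = sum(p.numel() for p in net.features.parameters())
fc_params = sum(p.numel() for p in net.classifier.parameters())

print(f"Convolution layers:     {conv_params:,}")  # ~2.5 million
print(f"Fully connected layers: {fc_params:,}")    # ~58.6 million
```

Almost all of the weights sit in the fully connected layers at the end, just as described above.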

AlexNet achieved a top-5 error rate of 16.4% on ImageNet.

AlexNet was trained for 6 days simultaneously on two Nvidia GeForce GTX 580 GPUs, which is the reason why the network is split into two pipelines. Isn’t that awesome.. :D

More about AlexNet: Just Deep Learning things!

ZFNet

ZFNet Structure without Fully Connected layers

The ZFNet structure is similar to the AlexNet structure, but with different parameters on some layers. ZFNet chose to capture a smaller patch of pixels at a time, so the first kernel size is 7*7 with a stride of 2, unlike the 11*11, stride-4 kernel in AlexNet. The kernel size decreases or remains the same as we move forward in the network. There are a total of 8 layers in the network, with 5 convolution layers and 3 fully connected layers. The maxpooling layers are not counted as separate layers because they have no parameters associated with them.
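As a quick sketch of that one change, here are the two first layers side by side in PyTorch, with their parameter counts:

```python
import torch.nn as nn

# First convolution layer of AlexNet: 96 kernels, 11x11, stride 4
alexnet_conv1 = nn.Conv2d(3, 96, kernel_size=11, stride=4)

# First convolution layer of ZFNet: 96 kernels, 7x7, stride 2
zfnet_conv1 = nn.Conv2d(3, 96, kernel_size=7, stride=2)

for name, layer in [("AlexNet conv1", alexnet_conv1),
                    ("ZFNet conv1", zfnet_conv1)]:
    n = sum(p.numel() for p in layer.parameters())
    print(f"{name}: {n:,} parameters")  # 34,944 vs 14,208
```

The smaller kernel and stride capture finer detail per step, at the cost of larger feature maps early in the network.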

This architecture won the ImageNet competition in 2013, achieving a top-5 error rate of 14.8%.

More on ZFNet & the competition: Just Deep Learning things!

Deep Learning is an interesting & awesome field to dig into. So, I would highly recommend that people who want to get started with Machine Learning or #100DaysofMLCode study Deep Learning first, because you will really be able to understand the meaning and working of the algorithms and processes that happen behind simply calling functions in ML.

Machine Learning becomes a piece of cake once you are done with Deep Learning! :)

Happy Training and Testing!
