Create a Convolutional Neural Network with TensorFlow

This portion of the blog requires a bit of prior knowledge of what neural networks are and how they function. Here are two links (1 & 2) to relatively comprehensive introductions to neural networks.
Brief Introduction to Convolutional Neural Networks

Convolutional Neural Networks (CNNs) take a bit of inspiration from biological processes, primarily the visual cortex in the brain. In a 1962 experiment, Hubel and Wiesel discovered that certain individual neurons fired only in the presence of edges of a particular orientation and location in a visual stimulus. The experiment also showed that these neurons are arranged in a columnar structure that contributes to visual perception.
ConvNet Structure
In short, a CNN can be described as a stack of convolutional, pooling, and fully-connected layers. Each CNN layer transforms a 3D input volume into a 3D output volume of activations, with the neurons arranged along (width, height, depth).

Convolutional Layers
Each convolutional layer consists of a set of learnable filters, each with its own stride and padding, and each filter produces a 2D activation map as it slides across every spatial position of the input.
As a hyperparameter, depth represents the number of filters to use, each of which tries to learn something different in the input, e.g. edges, color blobs, etc. The stride of a filter is how far the filter moves across the input at each step (usually 1 or 2). Padding controls the amount of zero-padding: “same” padding surrounds the outer edges of the convolved image with 0’s in order to keep the dimensions of the image the same, while “valid” padding does not add any 0’s and thus reduces the dimensionality of the features.

To calculate the spatial size of the output volume, consider the following:
- Let W be the input volume size
- Let F be the receptive field (filter) size of the Conv Layer neurons
- Let P be the amount of zero padding
- Let S be the stride of the filter

The output volume size is then (W - F + 2P) / S + 1.
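The standard output-size formula, (W - F + 2P) / S + 1, can be checked with a small helper. This is a pure-Python sketch; the example values (a 28 x 28 input with a 5 x 5 filter, matching the architecture used later) are illustrative:

```python
def conv_output_size(W, F, P, S):
    """Spatial size of a conv layer's output: (W - F + 2P) / S + 1."""
    return (W - F + 2 * P) // S + 1

# "Valid" padding (P = 0): a 5x5 filter shrinks a 28x28 input
valid = conv_output_size(28, 5, 0, 1)  # 24

# "Same" padding keeps the size: P = (F - 1) / 2 = 2 for a 5x5 filter
same = conv_output_size(28, 5, 2, 1)  # 28
```

Note that W, F, P, and S must be chosen so the division comes out evenly, otherwise the filter does not tile the input cleanly.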

Pooling Layers
The main function of the pooling layer is to reduce the spatial size of the feature maps, while still retaining the most relevant information through the dimensionality reduction.
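The most common pooling operation, max pooling, simply keeps the largest value in each window. A pure-Python sketch of a 2 x 2 max pool with stride 2 (the same configuration used in the architecture later), on a made-up 4 x 4 feature map:

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 over a 2D feature map (list of lists)."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            # Keep only the largest activation in each 2x2 window
            row.append(max(feature_map[i][j],     feature_map[i][j + 1],
                           feature_map[i + 1][j], feature_map[i + 1][j + 1]))
        pooled.append(row)
    return pooled

fmap = [[1, 3, 2, 0],
        [4, 6, 1, 1],
        [0, 2, 5, 7],
        [1, 1, 3, 2]]

max_pool_2x2(fmap)  # [[6, 2], [2, 7]]
```

Each output value is the strongest response in its window, so the map shrinks by half in each spatial dimension while the dominant activations survive.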

Fully Connected (Dense) Layer
The fully connected layer is used to associate the extracted features with a single class from the set of possible classes. Every neuron in the preceding activation layer is connected to each neuron in this layer, and the output is a probability distribution over the likelihood of each class being correct.
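That probability distribution is typically produced by a softmax over the final layer's raw scores (logits). A pure-Python sketch with made-up logits for illustration:

```python
import math

def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    # Subtracting the max is a standard numerical-stability trick
    exps = [math.exp(z - max(logits)) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs sums to 1, and the largest logit gets the largest probability
```
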

For a much more in depth tutorial on ConvNets, please refer to this link.
Implementation
This implementation of a ConvNet is for classifying data in the MNIST dataset, which is comprised of gray-scale 28 x 28 pixel images of handwritten digits [0–9].
More information on the dataset can be found here.
The Input Pipeline
- Extract
- Transform
- Load
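The three stages above map directly onto TensorFlow's `tf.data` API. A minimal sketch, using randomly generated stand-in arrays with MNIST's shape instead of the real dataset (the array sizes and batch size here are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf

# Stand-in data shaped like MNIST; the real code would load the dataset here.
images = np.random.randint(0, 256, size=(100, 28, 28, 1)).astype("float32")
labels = np.random.randint(0, 10, size=100)

dataset = (
    tf.data.Dataset.from_tensor_slices((images, labels))  # Extract
    .map(lambda x, y: (x / 255.0, y))                     # Transform: scale to [0, 1]
    .shuffle(buffer_size=100)
    .batch(32)                                            # Load: feed batches to the model
)

batch_images, batch_labels = next(iter(dataset))
batch_images.shape  # (32, 28, 28, 1)
```
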
The Model Architecture
Conv Layer 1:
- 32 filters
- Kernel of size 5 x 5 pixels
- Stride of 1 pixel
- “Same” padding
- ReLU activation
Pooling Layer 1:
- Max Pool Operation
- Kernel of size 2 x 2 pixels
- Stride of 2 pixels
Conv Layer 2 is the same as Conv Layer 1, except that it has twice as many filters as before.
Pooling Layer 2 has exactly the same parameters as Pooling Layer 1.
Fully Connected Layer:
- Reshape and flatten the output from the previous layer
- Set the number of neurons desired and activation type
- Use dropout if needed, to prevent overfitting
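The architecture described above could be sketched with the Keras API as follows. The layer sizes come from the lists above; the dense-layer width (1024) and dropout rate (0.4) are illustrative assumptions, since the text leaves them as choices:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    # Conv Layer 1: 32 filters, 5x5 kernel, stride 1, "same" padding, ReLU
    tf.keras.layers.Conv2D(32, 5, strides=1, padding="same", activation="relu"),
    # Pooling Layer 1: 2x2 max pool, stride 2
    tf.keras.layers.MaxPooling2D(pool_size=2, strides=2),
    # Conv Layer 2: same as Conv Layer 1 but with twice the filters
    tf.keras.layers.Conv2D(64, 5, strides=1, padding="same", activation="relu"),
    # Pooling Layer 2: same parameters as Pooling Layer 1
    tf.keras.layers.MaxPooling2D(pool_size=2, strides=2),
    # Fully connected layer: flatten, dense, dropout (width and rate assumed)
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```

Tracing the shapes with the earlier formula: 28x28 stays 28x28 through each "same" conv, halves to 14x14 and then 7x7 through the two pools, so the flattened vector has 7 * 7 * 64 = 3136 features.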
Model Evaluation
The full code for this part can be found here.
Running the Code Step by Step
To set up the code on your machine, I’ve provided a requirements document whose commands you can simply type into the terminal. (Windows users should use the URL provided in a browser for the Python installation.)
The code below covers the following:
- Installing python2 or python3
- Installing pip or pip3
- Installing virtualenv
- Installing all the needed dependencies within the “activated” virtual environment
- Running the code and deactivating the virtual environment
Adding Extra Features:
The full code for this “second” part can be found here. (Please note that the part below cannot be done without saving the summaries of your model)

Visualizing The Results w/ TensorBoard
- Reactivate the virtual environment
- Run the following in the terminal:
tensorboard --logdir=path/to/log-directory
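The summaries TensorBoard reads can be written from the training code with `tf.summary`. A minimal sketch using the TensorFlow 2 API; the log directory name and the scalar values are placeholders:

```python
import tensorflow as tf

# Write one scalar summary per step; point --logdir at this directory
# when launching TensorBoard. "logs/run1" is a placeholder path.
writer = tf.summary.create_file_writer("logs/run1")
with writer.as_default():
    for step in range(3):
        tf.summary.scalar("loss", 1.0 / (step + 1), step=step)
writer.flush()
```
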

The repository for the code can be found here.
