Deep Learning ‘Hello World!’

Sagini Abenga
Published in The Andela Way
5 min read · Jun 6, 2018
  • This is a simple introduction to deep learning in Python using the MNIST dataset. The dataset can be found here.
  • This article assumes basic knowledge of Python and machine learning concepts.

Definitions

Artificial Intelligence: the theory and development of computer systems able to perform tasks normally requiring human intelligence.

Machine Learning: according to Tom Mitchell, a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. In simpler terms, it is the branch of Artificial Intelligence concerned with enabling machines to learn from data instead of being explicitly programmed with regard to a specific task, kind of like telling a computer ‘Hey, find me patterns in this data’.

Deep Learning: This is the branch of machine learning that uses artificial neural networks as the basis of learning. Artificial neural networks are information-processing systems loosely modelled on the biological neurons in our brains. The emphasis on the word ‘deep’ comes from the fact that these networks tend to have many layers, depending on the complexity of the task at hand.

Deeper differentiations of these terms can be found here and here.

  • As we can see, deep learning is a subset of machine learning which is a subset of artificial intelligence.
  • Keras is a Python library that eases the building of deep learning models. Its documentation can be found here.

Working Environment

  • Ensure you have the latest Python 3 installed on your machine, along with a virtual environment utility, e.g. `virtualenv`.
  • Create a folder, e.g. mnist.
  • Create a virtual environment and activate it. Check here for more on virtual environments.
  • Install the necessary libraries:
  1. python-mnist (loads the ubyte files)
  2. tensorflow (supports keras computations)
  3. keras (builds neural networks)
  4. tqdm (displays progress of looping functions)
  5. h5py (stores keras models)
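With the virtual environment active, all five can be installed in one go (a sketch; the article does not pin any versions):

```shell
pip install python-mnist tensorflow keras tqdm h5py
```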

MNIST

  • This is the Modified National Institute of Standards and Technology database. It is a subset of the original NIST database, which is much larger.
  • The MNIST database contains images of handwritten digits and their correct labels. It is preferred as an introductory dataset to deep learning because of its simplicity.

Data Preparation

  • Download the files here.
  • Uncompress the files (leave the file names unchanged).
  • In the mnist folder we created, create a folder data (/mnist/data/).
  • Move the uncompressed files into the data folder.
  • In the root folder (mnist), create a file called load_data.py. This will contain scripts used to load and prepare data for the network to consume. Also, create another one called network.py. This will contain scripts that build and run the neural network. Finally, create a folder models, which will be used to store the trained model.

Scripts

Add the following gists into load_data.py

The first thing we do is make the necessary imports:

MNIST opens the ubyte files that store the datasets. array converts native Python array-like objects, e.g. lists, into numpy arrays. tqdm displays progress bars for looping procedures.

To load the data:

I decided to combine the training and testing data so we can have a custom split.
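A sketch of how that combination might look, assuming python-mnist's `load_training`/`load_testing` methods and the ubyte files sitting in ./data/ (the function and variable names are illustrative, not the article's originals):

```python
from numpy import array

def load_combined(data_dir='data'):
    """Load the MNIST training and testing sets and combine them."""
    from mnist import MNIST          # python-mnist; imported lazily so the
    mndata = MNIST(data_dir)         # sketch can be read without the files
    train_images, train_labels = mndata.load_training()   # 60,000 samples
    test_images, test_labels = mndata.load_testing()      # 10,000 samples
    images = array(list(train_images) + list(test_images))
    labels = array(list(train_labels) + list(test_labels))
    return images, labels            # 70,000 samples in total
```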

Prepare the outputs (labels) for a classification problem:

This is one-hot encoding of the outputs (the labels, e.g. 0, 1, 2, 3, etc.). It converts them into classes. For example, if we had 3 colours (red, green and blue), red becomes [1,0,0], green becomes [0,1,0] and blue [0,0,1]. We do this because this is a classification problem, so we do not need to predict actual values, but the classes objects belong to. For more context, check out more here.
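A minimal numpy sketch of the encoding (the function name is illustrative; keras also ships a `to_categorical` utility for exactly this):

```python
from numpy import zeros

def one_hot(labels, num_classes=10):
    """One-hot encode integer labels, e.g. with 3 classes, 0 -> [1, 0, 0]."""
    encoded = zeros((len(labels), num_classes))
    for i, label in enumerate(labels):
        encoded[i][label] = 1   # set the position matching the label to 1
    return encoded
```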

To normalise and reshape the input images:

This is done to contain the pixel values between 0 and 1. The reshaping converts each one-dimensional array into a two-dimensional array, the way an image is laid out. The normalisation is done by dividing all pixel values by 255, which is the highest value possible in a grayscale image pixel.
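A sketch of that step (the trailing channel dimension of 1 is an assumption on my part; keras convolutional layers expect it for grayscale images):

```python
from numpy import array

def prepare_images(images):
    """Scale pixels to [0, 1] and reshape each flat 784-vector into 28x28."""
    images = array(images).astype('float32') / 255   # 255 is the max grayscale value
    return images.reshape(len(images), 28, 28, 1)    # 1 = single colour channel
```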

We then split the data, 40000 data-points for training and 30000 for testing.
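The split itself is plain slicing; the names here are illustrative:

```python
def split_data(images, labels, train_size=40000, test_size=30000):
    """Take the first 40,000 samples for training and the next 30,000 for testing."""
    x_train, y_train = images[:train_size], labels[:train_size]
    x_test = images[train_size:train_size + test_size]
    y_test = labels[train_size:train_size + test_size]
    return x_train, y_train, x_test, y_test
```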

We’re now ready to get started with the neural network section.

Convolutional Neural Networks

Most of the data we encounter is one-dimensional, i.e. it can be represented as an array, for example [var1, var2, var3, …]. However, images are two-dimensional, so they are represented as arrays of arrays, for example [[var1, …], [var2, …], …]. Regular feed-forward neural networks take in one-dimensional arrays. To analyse data with more than one dimension, we need to get more creative. This is where convolutional neural networks, abbreviated as CNNs, come in.

CNNs are inspired by neural processes in the visual cortex of mammals. This process is highly complex but to make a simple abstraction, several types of layers are needed.

  • Convolutional layer: here, we create a feature map by sliding a set of weights across the image and taking the dot product of the weights and the pixel values they cover. This is the convolution. The set of weights slid over the image is the filter.
  • Pooling layer: here, outputs from convolutional layers are combined into a single neuron in the next layer, ‘summarising’ the information.
  • Activation layer: here, non-linearity is introduced using an activation function, for example a rectified linear unit. In this example, we’ll use relu, which returns its input when positive and 0 otherwise, i.e. Activation(x) = max(0, x); if x is less than 0, the output is 0.
  • Dropout layer: here, to prevent overfitting of the network, some of the neurons are randomly set to 0 during training. Overfitting is when the network memorises the data instead of learning patterns in it. Such a network will have excellent results on the training data but fail once it is tested on data it has not encountered.
  • Flatten layer: in keras, this is used to convert a multi-dimensional layer output into a one-dimensional one.
  • Fully Connected layer: this is a regular feed-forward network that gives us an output.

For a more intuitive understanding of how Convolutional Neural Networks work, check out this and this.

Scripts

Add the following gists into network.py

Make the necessary imports:
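As with load_data.py, the gists are not embedded in this copy; the imports would look roughly like this (layer names chosen to match the layer types described in the previous section):

```python
# network.py -- imports (sketch; the original gist is not embedded here)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense
from keras.callbacks import ModelCheckpoint
```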

Create the network:

We use the keras sequential model (the alternative is the functional API), which takes a layer and appends it to the previous one.
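A sketch of such a sequential model, using the layer types described earlier (the filter count, kernel size and layer widths here are assumptions, not necessarily the article's exact values):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=(28, 28, 1)))  # convolutional layer
model.add(Activation('relu'))                # activation layer
model.add(MaxPooling2D(pool_size=(2, 2)))    # pooling layer
model.add(Dropout(0.25))                     # dropout layer
model.add(Flatten())                         # flatten to one dimension
model.add(Dense(128, activation='relu'))     # fully connected layer
model.add(Dense(10, activation='softmax'))   # one output per digit class
```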

Compile the network:

This is an efficiency operation. It converts the created model into a sequence of matrix operations, enabling quicker computation by TensorFlow or Theano. We specify the loss function that will be used to determine how well the model has learnt, the algorithm used to train it, and the metrics used to measure how accurate the trained model is.
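A sketch of the compile step; the loss and optimizer choices are assumptions, while categorical_accuracy is the metric the article later uses to compare models. The stand-in model here is deliberately tiny so the snippet is self-contained:

```python
from keras.models import Sequential
from keras.layers import Flatten, Dense

# A tiny stand-in model so this snippet runs on its own.
model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))
model.add(Dense(10, activation='softmax'))

# Compile: loss function, training algorithm, and evaluation metrics.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['categorical_accuracy'])
```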

Fit the model:

We’ll define a checkpoint to store the best model at the current time. It is stored in the /mnist/models/ directory. We save a new model only if it’s better than the current best (has a higher categorical_accuracy).
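A sketch of such a checkpoint (the file name is hypothetical; the article only says the model goes into /mnist/models/):

```python
from keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint('models/mnist_model.h5',   # hypothetical file name
                             monitor='categorical_accuracy',
                             save_best_only=True,   # overwrite only on improvement
                             mode='max')            # higher accuracy is better
```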

We then train the network.
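A self-contained sketch of the fit call, on tiny synthetic stand-ins for x_train/y_train (the batch size and epoch count are assumptions; in the article, the checkpoint would also be passed via callbacks):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Flatten, Dense

# Synthetic stand-ins so the snippet runs on its own;
# in the article these come from load_data.py.
x_train = np.random.rand(64, 28, 28, 1).astype('float32')
y_train = np.eye(10)[np.random.randint(0, 10, 64)]   # one-hot labels

model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['categorical_accuracy'])

# In the article: model.fit(x_train, y_train, ..., callbacks=[checkpoint])
history = model.fit(x_train, y_train, batch_size=32, epochs=1, verbose=0)
```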

Evaluate the model:

Here, we check how well the model we’ve trained performs.
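A sketch of the evaluation, again on synthetic stand-ins for x_test/y_test so the snippet is self-contained:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Flatten, Dense

# Synthetic stand-ins; in the article these come from load_data.py.
x_test = np.random.rand(32, 28, 28, 1).astype('float32')
y_test = np.eye(10)[np.random.randint(0, 10, 32)]

model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['categorical_accuracy'])

# Returns the loss and each metric specified at compile time.
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
```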

To train the network, run: python network.py

Upon running, you should see keras print the training progress for each epoch, followed by the evaluation result.

The complete code can be found here.


Software Developer at Andela Kenya, interested in Artificial Intelligence.