Using Convolutional Neural Networks in TensorFlow to Analyse Chest XRays

Published in Analytics Vidhya, Oct 26, 2020

In this short article, we show how TensorFlow can be used to classify image data with deep neural networks. We demonstrate the method on the Chest XRay image dataset available on Kaggle. Because this is an image-based dataset, we use Convolutional Neural Networks, along with max pooling, to obtain our predictions.

Hopefully, this will serve as a guide that anyone can follow to fit the model themselves. The inner model layers can easily be altered (adding or deleting layers, adding dropout, etc.).

Chest XRay Dataset:

The Chest XRay dataset consists of thousands of XRay images. Each image is classified as either healthy (Normal) or unhealthy (Pneumonia). The dataset is large and contains .jpeg files, so it is distributed as a zip file. We use the zipfile package to unzip the data and create the respective training and validation directories.
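The unzip step can be sketched as follows. This is a minimal sketch rather than the exact code used for the article: the function names are hypothetical, and the split names (such as `train` and `val`) and class folder names (`NORMAL`, `PNEUMONIA`) are assumptions about the archive layout that should be checked against your download.

```python
import os
import zipfile


def extract_dataset(zip_path, dest_dir):
    """Unzip the downloaded archive into dest_dir and return dest_dir."""
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest_dir)
    return dest_dir


def class_dirs(base_dir, split, classes=("NORMAL", "PNEUMONIA")):
    """Map each class name to its folder inside a split such as 'train'.

    The split and class folder names are assumptions about the archive layout.
    """
    return {name: os.path.join(base_dir, split, name) for name in classes}
```

For example, `class_dirs(extract_dataset("chest_xray.zip", "data"), "train")` would return the two training folders, assuming the archive is saved under that (hypothetical) file name.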

Load in images:

Once you have downloaded the images, set your working directory to the location where they are saved.

Figure 1: A sample of 16 images, plotted from my Python code. The top half (the first 8 images) shows healthy lungs; the bottom half (the last 8) shows lungs with pneumonia.
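A grid like the one in Figure 1 could be produced along these lines. This is only a sketch under assumptions: the function names are hypothetical, the two folders are assumed to hold the healthy and pneumonia images respectively, and matplotlib (with Pillow for reading .jpeg files) is assumed to be installed.

```python
import os


def sample_image_paths(normal_dir, pneumonia_dir, n_each=8):
    """Return n_each healthy paths followed by n_each pneumonia paths."""
    def first_n(folder):
        names = sorted(os.listdir(folder))[:n_each]
        return [os.path.join(folder, name) for name in names]

    return first_n(normal_dir) + first_n(pneumonia_dir)


def plot_image_grid(paths, n_cols=4):
    """Show the images in a grid with n_cols columns (needs matplotlib)."""
    # Imported here so the path helper above works without matplotlib.
    import matplotlib.image as mpimg
    import matplotlib.pyplot as plt

    n_rows = -(-len(paths) // n_cols)  # ceiling division
    fig = plt.figure(figsize=(4 * n_cols, 4 * n_rows))
    for i, path in enumerate(paths):
        ax = fig.add_subplot(n_rows, n_cols, i + 1)
        ax.axis("off")
        ax.imshow(mpimg.imread(path), cmap="gray")
    plt.show()
```

With the default `n_each=8` and `n_cols=4`, the healthy images fill the top two rows and the pneumonia images fill the bottom two.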

We train the model using a regular Convolutional Neural Network (CNN), built from a small stack of convolution and max pooling layers and a single dropout layer. We also use the relu activation function for our inner layers.

A sample of four healthy lungs plotted in grayscale.

Figure 2: A sample of four sick lungs plotted in grayscale.

Define our TensorFlow Model:

We use four convolutional layers, each followed by a max pooling layer, and we double the number of filters from one convolutional layer to the next. We add a dropout layer with a rate of 0.3 after our final relu activation. This helps to prevent over-fitting, which in turn tends to improve validation accuracy.
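A model matching this description might look like the sketch below. The text fixes the four convolution and max pooling blocks, the doubling of filters, and the dropout rate of 0.3; everything else is an assumption made for illustration, including the starting filter count of 16, the 3x3 kernels, the 2x2 pooling windows, the dense layer width of 512, and the single sigmoid output unit.

```python
def conv_filters(base=16, n_layers=4):
    """Filter counts that double with each convolutional layer."""
    return [base * 2 ** i for i in range(n_layers)]


def build_model(input_shape=(180, 180, 3), dropout_rate=0.3):
    """Four Conv2D + MaxPooling2D blocks, a dense relu layer, dropout, sigmoid output."""
    import tensorflow as tf  # imported here so conv_filters needs no TensorFlow

    layers = [tf.keras.Input(shape=input_shape)]
    for n_filters in conv_filters():
        layers.append(tf.keras.layers.Conv2D(n_filters, (3, 3), activation="relu"))
        layers.append(tf.keras.layers.MaxPooling2D((2, 2)))
    layers += [
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation="relu"),   # assumed width
        tf.keras.layers.Dropout(dropout_rate),           # after the final relu
        tf.keras.layers.Dense(1, activation="sigmoid"),  # Normal vs Pneumonia
    ]
    return tf.keras.Sequential(layers)
```

Layers can be added or removed by changing `n_layers` in `conv_filters`, and the dropout rate can be tuned through `dropout_rate`.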

We can also look at a summary of our model:

Figure 3: Our model summary. We have several convolutional layers, max pooling and a dropout layer.
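To help read the output shapes in a summary like this, the short calculation below traces how the spatial size of a 180 px input shrinks through four convolution and max pooling blocks. It is a sketch that assumes 3x3 kernels without padding and 2x2 pooling windows; those sizes are not stated in the article, so the numbers in your own summary may differ.

```python
def feature_map_sizes(size=180, n_blocks=4, kernel=3, pool=2):
    """Spatial size after each (convolution, max pooling) block.

    Assumes unpadded ('valid') convolutions with stride 1 and
    non-overlapping pooling windows, which are the Keras defaults.
    """
    sizes = []
    for _ in range(n_blocks):
        size = size - kernel + 1   # the convolution trims the border
        size = size // pool        # pooling keeps one value per window
        sizes.append(size)
    return sizes


print(feature_map_sizes())  # [89, 43, 20, 9]
```

Under these assumptions, the final feature maps are 9 px by 9 px before they are flattened and passed to the dense layers.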

Image Augmentation:

We feed each XRay image to the network at 180 px by 180 px, so our input_shape is set to (180, 180, 3). The 3 is the number of colour channels (RGB). We use ImageDataGenerator to create additional augmented images. We set the fill_mode parameter of ImageDataGenerator to ‘nearest’, so that any pixels left empty by a transformation are filled in from the nearest pixel values. We rescale by 1/255 so that the intensity of each pixel lies between 0 and 1.
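The augmentation setup might be sketched as follows. Only the rescaling, the ‘nearest’ fill and the 180 by 180 target size come from the text; the rotation, shift and zoom ranges, the batch size and the function name are illustrative assumptions. Note also that recent TensorFlow releases mark ImageDataGenerator as a legacy API, so this sketch assumes a version where it is still available.

```python
# Augmentation settings. Only rescale and fill_mode come from the text;
# the remaining values are illustrative assumptions.
AUGMENTATION = dict(
    rescale=1.0 / 255,       # map pixel intensities from [0, 255] to [0, 1]
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    fill_mode="nearest",     # fill pixels exposed by shifts and rotations
)


def make_generators(train_dir, val_dir, image_size=(180, 180), batch_size=32):
    """Return (training, validation) generators that read images from folders."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_datagen = ImageDataGenerator(**AUGMENTATION)
    val_datagen = ImageDataGenerator(rescale=1.0 / 255)  # no augmentation

    train_gen = train_datagen.flow_from_directory(
        train_dir, target_size=image_size,
        batch_size=batch_size, class_mode="binary")
    val_gen = val_datagen.flow_from_directory(
        val_dir, target_size=image_size,
        batch_size=batch_size, class_mode="binary")
    return train_gen, val_gen
```

The validation images are only rescaled, not augmented, so that validation accuracy is measured on unmodified XRays.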

Compile and Fit Model:

We use the RMSprop optimiser to perform gradient descent with a learning rate of 0.0001, which ensures we take small steps.
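A compile step consistent with this could look like the sketch below. The optimiser and learning rate are taken from the text; the binary cross-entropy loss and the accuracy metric are assumptions that fit a two-class problem with a sigmoid output, and the function name is hypothetical.

```python
LEARNING_RATE = 1e-4  # small steps, as stated in the text


def compile_model(model, learning_rate=LEARNING_RATE):
    """Compile a Keras model for binary classification with RMSprop."""
    import tensorflow as tf

    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=learning_rate),
        loss="binary_crossentropy",  # assumed: two classes, sigmoid output
        metrics=["accuracy"],
    )
    return model
```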

We now fit the model. We specify 25 epochs, and we let TensorFlow decide the other parameters such as steps_per_epoch and validation_steps.
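In code, the fit step can be as short as the sketch below. The helper name is hypothetical; it assumes `model` is a compiled Keras model and that the training and validation data are generators of known length, so that `steps_per_epoch` and `validation_steps` can be left out.

```python
def fit_model(model, train_data, val_data, epochs=25):
    """Fit for a fixed number of epochs and return the Keras History object.

    steps_per_epoch and validation_steps are left unset so that Keras
    works them out from the length of the data generators.
    """
    return model.fit(train_data, validation_data=val_data, epochs=epochs)
```

The returned object stores the per-epoch accuracy and loss in its `history` attribute, which is useful for plotting later.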

Visualise inner Layers:

One great feature of TensorFlow is the ease with which we can access the inner layers of the network. First, let us look at how our model fitting went.

Figure 4: Summary of the last few epochs of our model fitting. Each epoch took approximately 2.5 minutes.

Each epoch took just over 2.5 minutes, and the accuracy increased both for training and validation. Our loss seemed to be plateauing, so 25 epochs is likely enough.

We can also view what the inside layers looked like.
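One way to obtain these intermediate outputs is to build a second model that shares the trained layers but returns the output of every convolution and pooling layer. The sketch below assumes a Sequential Keras model whose layers keep their default names (`conv2d`, `max_pooling2d`, and so on) and which contains at least two such layers; the function names are hypothetical.

```python
def is_feature_map_layer(layer_name):
    """True for layers whose output is a stack of 2D feature maps.

    Relies on the default Keras layer names ('conv2d', 'max_pooling2d', ...).
    """
    return layer_name.startswith(("conv2d", "max_pooling2d"))


def feature_maps(model, image_batch):
    """Return {layer name: activations} for every conv and pooling layer."""
    import tensorflow as tf

    layers = [layer for layer in model.layers if is_feature_map_layer(layer.name)]
    probe = tf.keras.Model(inputs=model.inputs,
                           outputs=[layer.output for layer in layers])
    outputs = probe.predict(image_batch)
    return dict(zip((layer.name for layer in layers), outputs))
```

Each returned array has shape (batch, height, width, filters), so a single filter of a single image can be shown with an ordinary image plot.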

Figure 5: Inside a convolutional layer. These layers help to pick up edges and contrasts in images.

Figure 6: Inside a max pooling layer. Max pooling keeps only the largest value in each small window, which simplifies the image while preserving its strongest features. This lets us recognise features more easily at a reduced resolution.
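As a small illustration of what a max pooling layer computes, the pure Python sketch below applies a 2x2 window with stride 2 to a tiny made-up image. The window size is an assumption, and the values are invented for the example.

```python
def max_pool_2x2(image):
    """Max pooling with a 2x2 window and stride 2 on a list-of-lists image."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            window = (image[r][c], image[r][c + 1],
                      image[r + 1][c], image[r + 1][c + 1])
            row.append(max(window))  # keep the largest value in the window
        pooled.append(row)
    return pooled


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
print(max_pool_2x2(image))  # [[4, 2], [2, 8]]
```

The 4x4 image is reduced to 2x2, and each output pixel is the strongest response from its window.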

Plot Accuracy and Loss:

Finally, we can examine the accuracy and loss plot per epoch.
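Plots like these can be drawn from the history recorded during fitting. The sketch below assumes the model was compiled with the `accuracy` metric, so that the history dictionary has the keys `accuracy`, `val_accuracy`, `loss` and `val_loss`; the function names are hypothetical and matplotlib is assumed to be installed.

```python
def history_curves(history):
    """Pair each training metric with its validation counterpart.

    `history` is the dict stored in the `history` attribute of the object
    returned by `model.fit`, with keys such as 'accuracy' and 'val_accuracy'.
    """
    return [(name, history[name], history["val_" + name])
            for name in ("accuracy", "loss")]


def plot_history(history):
    """Draw one training-vs-validation plot per metric."""
    import matplotlib.pyplot as plt

    for name, train_values, val_values in history_curves(history):
        epochs = range(1, len(train_values) + 1)
        plt.figure()
        plt.plot(epochs, train_values, label="training " + name)
        plt.plot(epochs, val_values, label="validation " + name)
        plt.xlabel("epoch")
        plt.ylabel(name)
        plt.legend()
    plt.show()
```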

Figure 7: Our training and validation accuracy. Both rise and level off around 90%.

Figure 8: Our training and validation loss.

And that is it! We have trained a model that classifies lung condition with an accuracy of over 90% on the validation set and 94% on the training set.

Summary:

Fitting a basic classification model using CNNs in TensorFlow is straightforward.
ImageDataGenerator can be used to augment images and increase our training set.
Over-fitting can be reduced by adding or increasing dropout.

Thanks for reading!

Github with code:

https://github.com/Robby955/ChestXray