Convolutional Neural Network (CNN) implementation for Diabetic Retinopathy Detection with TF

Swanand Mhalagi
Apr 17, 2019

The aim of this tutorial is to develop an automated detection system for diabetic retinopathy using a CNN. This was one of the competitions held on Kaggle, and you need a Kaggle account to download the dataset. The link to the IPython notebook containing this code is at the end of this tutorial.

I used a Paperspace 12 VP CPU instance. We need to install a few Python packages such as Werkzeug, Flask, numpy, Keras, gevent, pillow, h5py and tensorflow.

Let's start directly with the hands-on training.

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import keras
from tqdm import tqdm
import os
from sklearn.model_selection import train_test_split
import cv2
from PIL import Image
import tensorflow as tf
from matplotlib import pyplot as plt
from keras.layers import Dense, Dropout, Flatten, Input
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.preprocessing import image
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Conv2D, MaxPooling2D
from numpy import array

I have already downloaded the dataset from Kaggle. (Once you download it, unzip all the train and test archives and merge the extracted train files into a single folder of training images.)
Below, I load the CSV file containing the training labels.

df_train = pd.read_csv('/storage/trainLabels.csv')

Let's take a look at all the labels.
‘10_left’ is the name of an image file, whereas ‘0/1/2/3/4’ are the labels.
‘10_left’ is the image of the left eye and ‘10_right’ is the image of the right eye of the same person.

df_train.values

array([['10_left', 0],
['10_right', 0],
['13_left', 0],
...,
['44348_right', 0],
['44349_left', 0],
['44349_right', 1]], dtype=object)
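
Since every file name combines a patient number with an eye side, it can be handy to split the two, for example to keep both eyes of one person together. A minimal sketch, assuming all names follow the `<number>_<left|right>` pattern seen in the output above; `parse_image_name` is a hypothetical helper and not part of the notebook.

```python
def parse_image_name(name):
    """Split a name such as '10_left' into (patient_id, eye)."""
    patient_id, eye = name.rsplit("_", 1)
    if eye not in ("left", "right"):
        raise ValueError(f"unexpected eye label in {name!r}")
    return int(patient_id), eye


# Names copied from the df_train.values output above.
names = ["10_left", "10_right", "13_left", "44349_right"]
parsed = [parse_image_name(n) for n in names]
print(parsed)
```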

There are 35125 images in the training set. The ‘level’ column holds the label for each image.

df_train.tail()

We will use Pandas to convert the ‘level’ column of df_train into a Series and get_dummies to one-hot encode it. (FYI, I am not using the one-hot encoding during training as of now.)

targets_series = pd.Series(df_train['level'])
one_hot = pd.get_dummies(targets_series, sparse = True)
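
To make the encoding concrete, here is a small NumPy-only sketch of what one-hot encoding does to the integer levels. It is an illustration rather than the notebook's code: it uses `np.eye` instead of `get_dummies`, and the ten labels are the first ten values the notebook prints for `one_hot_labelsY`.

```python
import numpy as np

# First ten labels, copied from the notebook output for one_hot_labelsY[:10].
labels = np.array([0, 0, 0, 0, 1, 2, 4, 4, 0, 1])
num_classes = 5  # levels 0 to 4

# Row i of the identity matrix is the one-hot vector for class i.
one_hot_demo = np.eye(num_classes, dtype=np.uint8)[labels]
print(one_hot_demo[:6])

# argmax along each row recovers the original integer labels.
recovered = one_hot_demo.argmax(axis=1)
```

One difference worth knowing: `get_dummies` only creates columns for the classes that actually occur in the data, while the `np.eye` version always produces five columns.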

As I said before, there are 5 types of labels, 0/1/2/3/4, distinguished as below (NPDR: Non-Proliferative Diabetic Retinopathy, PDR: Proliferative Diabetic Retinopathy)

Class — Name
0 — Normal
1 — Mild NPDR
2 — Moderate NPDR
3 — Severe NPDR
4 — PDR

targets_series[:10]
one_hot[:10]

Let's take a look at the array containing just the labels.

one_hot_labels = np.asarray(one_hot)
one_hot_labelsY = np.asarray(targets_series)
one_hot_labelsY[:10]

array([0, 0, 0, 0, 1, 2, 4, 4, 0, 1])
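
A quick way to see how the classes are distributed is to count the labels and print them next to their names. The sketch below only uses the ten labels shown above, so the counts say nothing about the full dataset; `CLASS_NAMES` is a hypothetical lookup built from the class table earlier in this tutorial.

```python
from collections import Counter

# Class table from this tutorial as a lookup (hypothetical helper name).
CLASS_NAMES = {
    0: "Normal",
    1: "Mild NPDR",
    2: "Moderate NPDR",
    3: "Severe NPDR",
    4: "PDR",
}

# The ten labels printed by one_hot_labelsY[:10].
sample_labels = [0, 0, 0, 0, 1, 2, 4, 4, 0, 1]
counts = Counter(sample_labels)

for level in sorted(CLASS_NAMES):
    print(f"{level}  {CLASS_NAMES[level]:<14} {counts.get(level, 0)}")
```

On the full label file, `df_train['level'].value_counts()` gives the same kind of summary in one call.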

Now we will initialize the shape of the image and the arrays to load images and labels

im_size1 = 786
im_size2 = 786
x_train = []
y_train = []

If you want to check all the image names:

# each row of df_train.values is (file name, level)
for f, level in tqdm(df_train.values):
    print(f)

10_left
10_right
13_left
13_right
15_left
15_right
16_left
16_right
17_left
17_right
19_left

I am creating a subset of 1000 images out of the total 35125 images.

df_test = df_train[:1000]

If you plan to run this code on all 35125 images, replace df_test with df_train. The code snippet below loads the images and labels into Python lists, which are converted to numpy arrays later.
You can also load images using OpenCV; I have included the OpenCV code as a commented-out block below.

"""
# This is the OpenCV implementation
for i, (f, level) in enumerate(tqdm(df_train.values)):
    img = cv2.imread('/storage/train/{}.jpeg'.format(f))
    if img is None:  # cv2.imread returns None when the file cannot be read
        continue
    label = one_hot_labels[i]
    x_train.append(cv2.resize(img, (im_size1, im_size2)))
    y_train.append(label)
np.save('x_train2', x_train)
np.save('y_train2', y_train)
print('Done')
"""
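# enumerate keeps i in step with the row, even when a file is skipped
for i, (f, level) in enumerate(tqdm(df_test.values)):
    try:
        img = image.load_img('/storage/train/{}.jpeg'.format(f), target_size=(im_size1, im_size2))
        arr = image.img_to_array(img)
    except OSError:  # missing or unreadable file: skip it
        continue
    label = one_hot_labelsY[i]
    x_train.append(arr)
    y_train.append(label)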

100%|██████████| 1000/1000 [01:43<00:00, 7.06it/s]
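
Because unreadable files are skipped, each image has to stay paired with its own label. The sketch below shows one way to guarantee that by taking the label from the same row as the file name. `load_pairs` and `fake_loader` are hypothetical stand-ins (no real images are read), and the labels in the toy rows are made up.

```python
def load_pairs(rows, loader):
    """Return (images, labels), skipping rows whose image fails to load."""
    images, labels = [], []
    for name, level in rows:
        try:
            img = loader(name)
        except OSError:
            continue  # skip the image and its label together
        images.append(img)
        labels.append(level)
    return images, labels


def fake_loader(name):
    # Stand-in for image.load_img: pretend '13_left' is missing on disk.
    if name == "13_left":
        raise FileNotFoundError(name)
    return f"<pixels of {name}>"


# Toy rows with made-up labels.
rows = [("10_left", 0), ("13_left", 2), ("15_right", 4)]
images, labels = load_pairs(rows, fake_loader)
print(labels)
```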

Let's just verify one of the loaded images.

plt.imshow(x_train[681]/255) #681 > Try some other number too 
plt.show()
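
The division by 255 is there because img_to_array returns float values in the 0 to 255 range, while imshow expects float RGB data between 0 and 1. A tiny sketch with a made-up 2x2 image shows the effect of the scaling:

```python
import numpy as np

# A made-up 2x2 RGB image with float values in the 0..255 range.
img = np.array(
    [[[0, 0, 0], [255, 255, 255]],
     [[255, 0, 0], [0, 128, 255]]],
    dtype=np.float32,
)

scaled = img / 255  # now every value lies between 0 and 1
print(scaled.min(), scaled.max())
```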

It's important to split the dataset into training and validation sets, apart from the testing set, which we have separately.

x_valid = []
y_valid = []
X_train, X_valid, Y_train, Y_valid = train_test_split(x_train, y_train, test_size=0.1, random_state=1)
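
To see what a 10% hold-out split means for 1000 images, here is a NumPy-only sketch of the idea: shuffle the indices, then cut off a tenth for validation. It is an illustration, not scikit-learn's implementation, so the selected indices will differ from what train_test_split returns; `shuffle_split` is a hypothetical name.

```python
import numpy as np


def shuffle_split(n_samples, test_size=0.1, seed=1):
    """Return (train_idx, valid_idx) for a random hold-out split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_valid = int(round(test_size * n_samples))
    return order[n_valid:], order[:n_valid]


train_idx, valid_idx = shuffle_split(1000)
print(len(train_idx), len(valid_idx))
```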

Now we will define the model >>
The model has 2 convolutional layers, 2 max pooling layers, an image flattening layer and a dense hidden layer followed by a dense output layer. Models in Keras/TF come in 2 forms: Sequential (model = Sequential()) or built with the Functional API.
The code below uses the Functional API, which is usually used for complex models. I will leave the lightweight Sequential() model as an alternative further down. You can modify the layers to make your own custom model.

visible = Input(shape=(786,786,3))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat = Flatten()(pool2)
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(5, activation='softmax')(hidden1)  # one probability per class (levels 0 to 4), as sparse_categorical_crossentropy expects
model = Model(inputs=visible, outputs=output)

If you want to run a less complex model, use the lines below instead. I suggest using a transfer learning technique for better results if you go with this model.

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(786, 786, 3)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(5, activation=tf.nn.softmax)  # 5 output units, one per class
])

There is a variety of optimizers available >> https://keras.io/optimizers/
More about loss functions >> https://keras.io/losses/
Metrics >> https://keras.io/metrics/

Feel free to play around with these metrics.

model.compile(optimizer='adam', 
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
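
The 'sparse' variant of categorical cross-entropy takes integer labels (0 to 4 here) instead of one-hot vectors and expects the model to output one probability per class. A minimal NumPy sketch of the quantity it computes, leaving out the numerical clipping that Keras applies; `sparse_ce` is a hypothetical name and the probabilities are made up.

```python
import numpy as np


def sparse_ce(probs, labels):
    """Mean of -log(probability assigned to the true class)."""
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


# Two made-up predictions over the five classes (each row sums to 1).
probs = [[0.7, 0.1, 0.1, 0.05, 0.05],
         [0.2, 0.2, 0.2, 0.2, 0.2]]
labels = [0, 4]

loss = sparse_ce(probs, labels)
print(round(loss, 4))
```

A confident correct prediction (0.7 for the true class) contributes a small term, while the uniform guess (0.2) contributes a larger one.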

Let's convert the lists into numpy arrays; this might take some time.

y_train_raw = np.array(Y_train)
x_train_raw = np.array(X_train)

This is how the layers are stacked on top of each other:

model.summary()
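
The output shapes and parameter counts in the summary can be checked by hand. The sketch below does the arithmetic for the Functional API model defined earlier, assuming the Keras defaults of 'valid' padding and stride 1 for Conv2D and a stride equal to the pool size for MaxPooling2D. The helper names are hypothetical, and the final output layer is left out.

```python
def conv_out(size, kernel):
    # 'valid' padding, stride 1: the kernel has to fit inside the image
    return size - kernel + 1


def pool_out(size, pool):
    # stride equals the pool size, leftover border pixels are dropped
    return size // pool


def conv_params(kernel, in_channels, filters):
    # one kernel x kernel x in_channels weight block plus one bias per filter
    return kernel * kernel * in_channels * filters + filters


size = 786
size = conv_out(size, 4)   # conv1: 783
size = pool_out(size, 2)   # pool1: 391
size = conv_out(size, 4)   # conv2: 388
size = pool_out(size, 2)   # pool2: 194

flat_units = size * size * 16  # 16 filters in conv2
params = {
    "conv1": conv_params(4, 3, 32),
    "conv2": conv_params(4, 32, 16),
    "hidden1": flat_units * 10 + 10,  # Dense(10): weights plus biases
}
print(size, flat_units, params)
```

Almost all of the parameters sit in the first dense layer, because the flattened feature map is still very large.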

This command actually trains the model. Even with a small number of images, you might run into an ‘Insufficient memory’ error or a ‘Kernel restart’ error.

model.fit(x_train_raw, y_train_raw, epochs=5)
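
A back-of-the-envelope calculation shows why memory runs out so quickly. The sketch assumes the images are stored as float32 (4 bytes per value), which is what img_to_array produces by default. The 256x256 size is only a hypothetical smaller alternative for comparison, and `array_gib` is a hypothetical helper.

```python
def array_gib(n_images, height, width, channels=3, bytes_per_value=4):
    """Size in GiB of a dense image array (float32 by default)."""
    return n_images * height * width * channels * bytes_per_value / 2**30


full = array_gib(1000, 786, 786)   # the subset used in this tutorial
small = array_gib(1000, 256, 256)  # hypothetical smaller image size
print(f"{full:.2f} GiB at 786x786, {small:.2f} GiB at 256x256")
```

This counts only the input array itself. The activations and weights that training needs come on top of it.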

Let's convert the validation lists into numpy arrays as well.

x_valid_raw = np.array(X_valid)
y_valid_raw = np.array(Y_valid)

Once the model is trained, we need to evaluate its performance on the whole validation dataset.

test_loss, test_acc = model.evaluate(x_valid_raw, y_valid_raw)
test_loss
test_acc
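
To interpret the accuracy number, it helps to know how it is obtained and what a trivial predictor would score. The sketch below assumes the model outputs one probability per class. The predicted class is the argmax of each row, and the baseline always predicts the most frequent label. All numbers are made up for illustration.

```python
import numpy as np

# Made-up softmax outputs for four validation images over five classes.
probs = np.array([
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.5, 0.2, 0.1, 0.1],
    [0.7, 0.1, 0.1, 0.05, 0.05],
    [0.3, 0.1, 0.4, 0.1, 0.1],
])
y_true = np.array([0, 1, 2, 2])

# Accuracy: fraction of rows whose most probable class matches the label.
y_pred = probs.argmax(axis=1)
accuracy = float((y_pred == y_true).mean())

# Baseline: always predict the most frequent class in y_true.
majority = int(np.bincount(y_true).argmax())
baseline = float((y_true == majority).mean())
print(y_pred, accuracy, baseline)
```

If the validation accuracy is not clearly above such a baseline, the model may simply be predicting the most common class.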

You can find the IPython notebook at https://github.com/swanandM/Diabetic-Retinopathy-Detection-with-TF.git

You can modify this code to add things such as data augmentation, batch normalization, dropout and custom loss functions for better performance.
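
As a starting point for data augmentation, here is a minimal NumPy sketch that mirrors a random half of the images in a batch from left to right. It is an illustration on a tiny made-up batch, not the notebook's code, and `augment_flip` is a hypothetical helper.

```python
import numpy as np


def augment_flip(batch, rng):
    """Randomly mirror images in a (N, H, W, C) batch left to right."""
    flip = rng.random(len(batch)) < 0.5   # one coin toss per image
    out = batch.copy()
    out[flip] = out[flip][:, :, ::-1, :]  # reverse the width axis
    return out, flip


rng = np.random.default_rng(0)
# Tiny made-up batch: 2 images, 2 rows, 3 columns, 1 channel.
batch = np.arange(12, dtype=np.float32).reshape(2, 2, 3, 1)
augmented, flipped = augment_flip(batch, rng)
print(flipped)
```

The ImageDataGenerator class that the notebook already imports offers the same kind of transformation through its horizontal_flip argument.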
