Black-(Blind Or Not)

Anwoy Panigrahi
Published in Analytics Vidhya
7 min read · Sep 25, 2020

Overview: This blog is about detecting diabetic retinopathy, the leading cause of blindness among working-aged adults, especially in rural areas where medical screening is difficult to conduct. The idea is to detect the problem at an early stage, with the help of machine learning, so that blindness can be prevented. The main advantage of using machine learning is that the whole screening process can be sped up and accessed remotely where no doctor is available.

The full code is available on my GitHub profile

Table of Contents:

— — About the problem

— — About The Data Set

— — Metrics

— — Loading the data set

— — Exploring The Data

— — Data augmentation

— — Data preprocessing

— — Preparing image data generator

— — Modeling

— — Comparison Table

— — Kaggle score

— — Future Work

— — Reference

About The Problem:

As shown in the image above, these are the five irregularities found in an unhealthy eye, i.e., the irregularities that indicate diabetic retinopathy. Our task is to process the images to find how severe the problem is.

About The Data Set:

It is a Kaggle problem in which the data set consists of images from 5 different classes. As with all real-world data, the images contain noise, and the data set is highly imbalanced, with more than 1700 images of class 0 and fewer than 250 images of class 4. The training set consists of 3662 images and the test set contains 1928 images.

Metrics:

The metric used for evaluating the predictions is the kappa score. The kappa score measures the agreement between two raters who each classify N items into C mutually exclusive categories:

k = (po − pe) / (1 − pe)

where po is the relative observed agreement and pe is the probability of chance agreement.

If the raters are in complete agreement then k = 1. If there is no agreement among the raters (beyond what chance would produce) then k = 0.
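As a minimal sketch, the unweighted kappa from the formula above can be computed directly with NumPy (the function name and the two hypothetical rater arrays are my own; the competition itself scores with a weighted variant of kappa):

```python
import numpy as np

def cohen_kappa(rater_a, rater_b, num_classes):
    """Unweighted Cohen's kappa: (po - pe) / (1 - pe)."""
    rater_a = np.asarray(rater_a)
    rater_b = np.asarray(rater_b)
    n = len(rater_a)
    # Confusion matrix between the two raters.
    conf = np.zeros((num_classes, num_classes))
    for a, b in zip(rater_a, rater_b):
        conf[a, b] += 1
    po = np.trace(conf) / n                             # observed agreement
    pe = (conf.sum(axis=1) @ conf.sum(axis=0)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)
```

scikit-learn's `cohen_kappa_score` computes the same quantity (and supports `weights="quadratic"` for the weighted form).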

Loading the data set:

Now we save the paths of the train images in the train data frame.
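A sketch of that step, assuming a `train.csv` with an `id_code` and a `diagnosis` column and images stored under `train_images/` (the two sample id codes are made up):

```python
import pandas as pd

# Hypothetical contents of train.csv: one id_code and diagnosis per image.
train_df = pd.DataFrame({
    "id_code": ["000c1434d8d7", "001639a390f0"],
    "diagnosis": [2, 4],
})

# Save the full path of each training image in a new column.
train_df["path"] = "train_images/" + train_df["id_code"] + ".png"
```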

Exploring The Data:

— — — — — plotting a histogram of image labels:

From the plot, it's clear that there is a lot of data imbalance.

— — — — — — plotting some images of different classes:

From the plot, we can see that images differ a lot in size and intensity.

Data augmentation:

As we know, the data provided is highly imbalanced; one technique we can use to deal with that is upsampling. Upsampling is basically creating new data points from the existing data by using different augmentation techniques. In this blog, we will use horizontal- and vertical-flip augmentation on classes 1, 3, and 4, as their proportion is small compared to classes 0 and 2, and save those images so that we can use them while training.

Creating a data frame containing only the class labels 1, 3, and 4.

The function below returns two images for a given image: one vertically flipped and one horizontally flipped.

Now we need to save those images so that we can use them later. Before that, we need to convert each image to RGB format, because cv2 reads images in BGR format.

After augmentation, the data set looks like this, and we have a total of 5062 data points.

From the plot, it's clear that we have managed to reduce the class imbalance to some extent.

Data preprocessing:

For the preprocessing part, we will crop all images, apply the image-brightening technique used by Ben (winner of the previous competition), and finally use thresholding for better results.

So let's start with cropping all the images. In this step, we crop away the dark border pixels so that we have a clear view of the retina. But if an image is fully dark, cropping would remove everything, so we need to return the original image if the shape after cropping is 0.
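A minimal sketch of that cropping logic, assuming a brightness tolerance of 7 (a commonly used value in public kernels; the blog's exact threshold is not stated):

```python
import numpy as np

def crop_dark_borders(image, tol=7):
    """Crop away rows/columns that are entirely darker than `tol`.

    If the whole image is dark, cropping would remove everything,
    so the original image is returned instead.
    """
    gray = image.mean(axis=2) if image.ndim == 3 else image
    mask = gray > tol
    if not mask.any():          # fully dark image: keep it as-is
        return image
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```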

Now coming to the image-brightening technique used by Ben. In this function, we first convert the image from RGB to BGR, because ImageDataGenerator reads images in RGB mode whereas cv2 works in BGR mode. After that, we resize the images and add weighted blurred copies to them.

After that, we will apply the thresholding technique so that the irregularities are easily visible.

After the image brightening, I also tried the Canny edge-detection technique, but it could not detect all the irregularities present in the image.

Let's visualize the output of all the preprocessing we have till now.

From the image, it's clear that Canny edge detection is not working properly, so we will use thresholding.

Preparing image data generator:

Now we need to load all images and split them into training and validation.

Now we split the data into train and validation sets. I have used an 80–20 split with stratification because the data is imbalanced. After splitting, we have 4049 training images and 1013 validation images.
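The stratified split can be sketched with scikit-learn (the `path`/`diagnosis` column names and the 100-row toy data frame are illustrative, not the blog's actual data):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical augmented data frame: one row per image path and label.
df = pd.DataFrame({
    "path": [f"train_images/img_{i}.png" for i in range(100)],
    "diagnosis": [i % 5 for i in range(100)],
})

# 80-20 split, stratified on the label so both sets keep the
# same (imbalanced) class distribution.
train_set, valid_set = train_test_split(
    df, test_size=0.2, stratify=df["diagnosis"], random_state=42
)
```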

Now let's see the distribution of train and validation data set.

train_set distribution

valid_set distribution

It's clear that both the train and validation set have the same distribution.

We are ready to train the model.

Modeling:

— — — — — Base model:

For the base model, I did not train anything new: I used a pre-trained VGG16, on top of which I added a dense layer with 5 units and a softmax activation function. The results are not that good, as it misclassifies many class-2 and class-3 data points.
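A sketch of such a base model in Keras. The frozen backbone and the 5-way softmax head follow the description above; the global-average-pooling layer and the 224×224 input size are my assumptions (pass `weights=None` to skip the ImageNet download when experimenting):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential

def build_base_model(weights="imagenet", input_shape=(224, 224, 3)):
    """Frozen pre-trained VGG16 plus a 5-way softmax head."""
    backbone = VGG16(include_top=False, weights=weights,
                     input_shape=input_shape)
    backbone.trainable = False  # only the dense head is trained
    return Sequential([
        backbone,
        GlobalAveragePooling2D(),
        Dense(5, activation="softmax"),
    ])
```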

Training of base model

The confusion matrix of the base model looks like this.

— — — — — Basic CNN model:

Just for comparison, I implemented a basic CNN model. The model turns out to be a dumb one that classifies all the data points as class label 1.

Model architecture

Training of CNN model

The confusion matrix looks like this.

Clearly, it’s a dumb model that learns nothing.

— — — — — Transfer learning:

For transfer learning, I used EfficientNetB3 with a 300×300 image size and a batch size of 16. I used categorical cross-entropy as the loss function and a custom AUC as the metric.

First, let's initialize the model with pre-trained ImageNet weights and add a dense layer on top of it to treat the problem as classification.

Now comes the training part. While training the model, I used ReduceLROnPlateau as a callback. Its function is to reduce the learning rate when the model stops converging. Often a model stops improving not because it is fully optimized but because the learning rate is too high and it keeps overshooting the minima; if we reduce the learning rate over the epochs, convergence becomes easier and performance improves.
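The model and callback described above can be sketched as follows. The 300×300 input, the 5-way softmax head, the categorical cross-entropy loss, and the AUC metric come from the text; the `factor`/`patience` values and the Adam optimizer are my assumptions, and the built-in `tf.keras.metrics.AUC` stands in for the blog's custom AUC:

```python
import tensorflow as tf
from tensorflow.keras.applications import EfficientNetB3

def build_model(weights="imagenet", size=300):
    """EfficientNetB3 backbone with a 5-way softmax classification head."""
    backbone = EfficientNetB3(include_top=False, weights=weights,
                              pooling="avg", input_shape=(size, size, 3))
    head = tf.keras.layers.Dense(5, activation="softmax")(backbone.output)
    model = tf.keras.Model(backbone.input, head)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=[tf.keras.metrics.AUC(name="auc")])
    return model

# Halve the learning rate whenever validation loss stops improving.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6)
# model.fit(train_gen, validation_data=valid_gen,
#           epochs=30, callbacks=[reduce_lr])
```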

The results of EfficientNet are quite satisfying. Its confusion matrix looks like this.

We can see that it's still misclassifying class-3 data points, but the results are good enough.

— — — — — Ensemble model:

To reduce the number of points misclassified by EfficientNet, I tried implementing an ensemble model combining EfficientNet, VGG16, and ResNet, but the result was not much different. I used a majority vote of the three models to predict the final label.

ResNet50 architecture

VGG16 architecture

EfficientNetB3 architecture

Then comes prediction using these three models, for which I used a majority vote.
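A sketch of that majority vote over three arrays of predicted labels. The tie-break rule (falling back to the first model when all three disagree) is my assumption; the blog does not state one:

```python
import numpy as np

def majority_vote(pred_a, pred_b, pred_c, num_classes=5):
    """Combine three models' label predictions by majority vote.

    If all three models disagree, fall back to the first model's
    prediction (an assumption; the tie-break rule was not stated).
    """
    preds = np.stack([pred_a, pred_b, pred_c])
    out = np.empty(preds.shape[1], dtype=int)
    for i in range(preds.shape[1]):
        counts = np.bincount(preds[:, i], minlength=num_classes)
        out[i] = preds[0, i] if counts.max() == 1 else counts.argmax()
    return out
```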

The confusion matrix of the ensemble model looks like this.

Clearly, there is not much difference between the EfficientNet and ensemble models.

Comparison Table:

Kaggle score:

My score:

— — — Private score: 0.9152

— — — Public score: 0.763837

Top leader board score :

— — — Private score: 0.936129

— — — Public score: 0.856139

Future Work:

The results could be further improved by ensembling more models, by tuning the threshold values, or by using other techniques for extracting information from the images.

Reference :

My LinkedIn profile
