Malaria Detection in Blood Sample Images using Deep Learning

Learning In Chunks

Susant Achary
Analytics Vidhya
4 min read · Oct 26, 2019


Source: Wikipedia

Malaria is a mosquito-borne infectious disease that affects humans and other animals. Malaria causes symptoms that typically include fever, tiredness, vomiting, and headaches. In severe cases it can cause yellow skin, seizures, coma, or death. Symptoms usually begin ten to fifteen days after being bitten by an infected mosquito.

Regions of Spread:

Source: Bing Search

Diagnosis:

Source: Wikipedia

The image above led me to look for blood-sample images, since doctors collect patients' blood samples to detect the disease.

So, without much talk, let's get straight to the code. Today we will run through a short notebook to get a sense of malaria detection in blood-sample images, from finding the data in Kaggle Datasets to building a simple yet powerful classifier that separates parasitized samples from uninfected samples.

Note before proceeding: I definitely recommend using a Kaggle Kernel over Google Colab. Why? See below.

Source: Kaggle Kernel (type -> !nvidia-smi): 16 GB of GPU RAM, Nvidia Tesla P100
Source: Google Colab (type -> !nvidia-smi): 11 GB of GPU RAM, Nvidia Tesla K80

The P100 clearly beats the K80 on GPU memory, which helps when training large CNN models.
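To check which GPU a session has without reading the nvidia-smi table by eye, a small helper can query it from Python. This is a minimal sketch: the function name `gpu_summary` is hypothetical, and it assumes the `nvidia-smi` command-line tool is on the PATH (it returns `None` when it is not).

```python
import shutil
import subprocess


def gpu_summary():
    """Return one 'name, total memory' string per visible GPU, or None."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # the tool could not be run
    if result.returncode != 0:
        return None  # tool present but no usable GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(gpu_summary())
```

On a GPU session this should print one entry per card with its name and total memory; on a CPU-only machine it prints `None`.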

Code:

1. Load the data in the Kaggle kernel (the dataset is available in Kaggle Datasets).
  • The default kernel loads with a few imports and lists the dataset files if the dataset is linked.
  • Load the necessary libraries for building a classifier.

2. The dataset loads, and the file listing shows that the images are in .png format.
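Steps 1 and 2 can be sketched with the standard library alone. The dataset path and the two folder names below are assumptions about how the Kaggle dataset is laid out, and `list_images` and `extension_counts` are hypothetical helper names, so adjust them to whatever your kernel actually shows.

```python
from collections import Counter
from pathlib import Path

# Assumed location of the dataset inside a Kaggle kernel; adjust to your own path.
DATA_DIR = Path("/kaggle/input/cell-images-for-detecting-malaria/cell_images")
CLASS_NAMES = ("Parasitized", "Uninfected")  # assumed folder names


def list_images(data_dir, class_names=CLASS_NAMES):
    """Return (paths, labels); label i means class_names[i]."""
    paths, labels = [], []
    for label, name in enumerate(class_names):
        for path in sorted((Path(data_dir) / name).glob("*")):
            if path.is_file():
                paths.append(path)
                labels.append(label)
    return paths, labels


def extension_counts(paths):
    """Count files per lowercase extension, e.g. how many are '.png'."""
    return Counter(p.suffix.lower() for p in paths)


if DATA_DIR.exists():
    paths, labels = list_images(DATA_DIR)
    print(len(paths), "files:", dict(extension_counts(paths)))
```

Counting extensions before loading anything is a cheap way to spot stray non-image files that would otherwise break the image loader later.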

3. Visualize the parasitized sample images, then the uninfected sample images.

4. Resize the images to 64x64 (width x height); otherwise you may run into a memory error (it happened to me at 128x128). Then convert them to the Keras image array format.

  • See the image after resizing.
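A quick back-of-the-envelope calculation shows why the image size matters so much for memory. The sketch below assumes the whole dataset is held in one dense float32 array and that it contains about 27,558 images; that count is an assumption here, so check your own file count. `dataset_bytes` is a hypothetical helper name.

```python
def dataset_bytes(num_images, width, height, channels=3, bytes_per_value=4):
    """Bytes needed to hold every image as one dense float32 array."""
    return num_images * width * height * channels * bytes_per_value


NUM_IMAGES = 27_558  # assumed dataset size; check your own file count

for side in (64, 128):
    gib = dataset_bytes(NUM_IMAGES, side, side) / 2**30
    print(f"{side}x{side}: {gib:.2f} GiB")
```

Under these assumptions the 64x64 array needs about 1.26 GiB and the 128x128 array about 5.05 GiB. Doubling the side length quadruples the footprint, and that is before any extra copies made while normalising or splitting the data.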

5. Split the dataset into training and test sets.
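The article does not state the split ratio or method, so the following is only an illustration: a standard-library stratified split with an assumed 80/20 ratio. In practice scikit-learn's `train_test_split` with its `stratify` argument is the usual choice; `stratified_split` here is a hypothetical stand-in that shows the idea.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), keeping class proportions similar in both."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = int(round(len(indices) * test_fraction))
        test_idx.extend(indices[:cut])
        train_idx.extend(indices[cut:])
    rng.shuffle(train_idx)  # mix the classes so batches are not single-class
    rng.shuffle(test_idx)
    return train_idx, test_idx


labels = [0] * 50 + [1] * 50  # toy labels standing in for the two classes
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class keeps the parasitized/uninfected balance the same in both sets, so the test accuracy is not skewed by an unlucky shuffle.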

6. Import a pretrained model (VGG16, VGG19, ResNet, Inception, RetinaNet). This saves time and gives good results: a pretrained model has already been trained on a large dataset, and the features it learned help it pick up patterns in a new dataset.

  • Remove the top layers (they were trained on the 1000 ImageNet classes) and replace them with layers for our two classes.
  • Calling the model with these parameters also prints its summary.
  • Set the loss function, optimizer, and accuracy metric.
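Step 6 could look roughly like the sketch below, using VGG16 from `tensorflow.keras.applications`. The choice of VGG16, the frozen base, the 128-unit head, the dropout rate, the Adam optimizer, and the categorical cross-entropy loss are all assumptions made for illustration; the linked notebook may differ. `build_classifier` is a hypothetical name.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16


def build_classifier(input_shape=(64, 64, 3), num_classes=2, weights="imagenet"):
    """VGG16 convolutional base plus a small two-class head."""
    # include_top=False drops the original 1000-class ImageNet classifier.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # assumption: keep the pretrained filters frozen

    x = layers.Flatten()(base.output)
    x = layers.Dense(128, activation="relu")(x)  # assumed head size
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        optimizer="adam",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_classifier().summary()` downloads the ImageNet weights on first use and prints the layer table; passing `weights=None` builds the same architecture with random weights, which is handy for a quick shape check.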

7. Start the training. We achieved a very good accuracy of ~96%.

  • The learning curve is consistent with the values logged during training.
  • Results are reported on the test split of the dataset.
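For a medical classifier it helps to look past a single accuracy number and see which kind of mistake the model makes. The sketch below builds a confusion matrix from true and predicted labels in plain Python; the label lists are made up for illustration and are not the article's results, and `confusion_counts` and `accuracy` are hypothetical helper names.

```python
def confusion_counts(true_labels, predicted_labels, num_classes=2):
    """matrix[t][p] = number of samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up labels for illustration only; these are not the article's results.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]
matrix = confusion_counts(y_true, y_pred)
print(matrix, accuracy(matrix))
```

With real predictions, the off-diagonal cells show how many parasitized samples were missed versus how many uninfected samples were flagged, which matters more in screening than the overall percentage.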

Code Link:

https://www.kaggle.com/susant4learning/malariadetection-in-bloodsamples?scriptVersionId=22550587

Hope you found this interesting. In coming articles I will walk through other medical image formats as well, look at their challenges, and see how different CNN architectures help solve them.

Keep Learning!!!
