Malaria Detection from blood sample images using Intel® Distribution of OpenVINO™ Toolkit.

INTRODUCTION

Pranab Sarkar
Intel Software Innovators
5 min read · Jul 10, 2019


Malaria is a life-threatening disease. It’s typically transmitted through the bite of an infected Anopheles mosquito. Infected mosquitoes carry the Plasmodium parasite. When this mosquito bites you, the parasite is released into your bloodstream.

Over 400,000 people die from malaria each year, mostly children under five years of age, with most of the malaria cases occurring in Sub-Saharan Africa. An estimated 300–600 million people suffer from malaria each year. More than 40 percent of the world’s population lives in malaria-risk areas.

Why do we need AI?

Various types of tests are performed to help diagnose malaria, to monitor for relapses, and to determine the drug susceptibility of the parasite causing the infection.

Disadvantages with the legacy malaria detection tests

Microscopic examination remains the “gold standard” for laboratory confirmation of malaria. Following the WHO protocol, the blood smear is examined at 100X magnification, and a trained person manually counts how many red blood cells out of 5000 contain parasites. This is an intensive manual process that requires proper expertise in classifying and counting parasitized and uninfected cells. It typically does not scale well and can cause problems in regions of the world where the right expertise is not available.

Rapid diagnostic tests (RDTs) give accurate results only when the testing device is kept at the proper temperature and in a well-maintained environment. An RDT may also fail to detect infections in which only a small number of malaria parasites are circulating in the patient’s bloodstream.
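
As a small worked illustration of the counting step described above, the result of a manual count can be expressed as the percentage of counted red blood cells that contain parasites. The function name and the example count of 50 infected cells are hypothetical and serve only to show the arithmetic.

```python
def parasitemia_percent(infected_cells, total_cells=5000):
    """Return the percentage of counted red blood cells that are parasitized."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= infected_cells <= total_cells:
        raise ValueError("infected_cells must be between 0 and total_cells")
    return 100.0 * infected_cells / total_cells


# Hypothetical count: 50 parasitized cells found among 5000 counted cells.
print(parasitemia_percent(50))  # 1.0
```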

Missing a single parasite in what appears to be a parasite-free sample can be deadly. In cases where no parasites are found, blood-smears and cell counts are repeated every eight hours. If no parasites are found after three repetitions, the patient is cleared. This is done to minimize the number of missed diagnoses (false negatives) that could lead to death.

Hence, microscopic examination is the best available way to detect malaria, but it leaves many gaps to fill. Because it is such an intensive manual process, it should be automated with the help of Intel-powered devices and frameworks to make the test results more accurate.

How AI and Intel are changing the world

Overview and Architecture

Bringing computer vision and artificial intelligence to your IoT and edge device prototypes is now easier than ever with the enhanced capabilities of the new Intel® Neural Compute Stick. Whether you’re developing a smart camera, a drone with gesture-recognition capabilities, an industrial robot, or the next, must-have smart home device, the Intel NCS offers what you need to prototype smarter. What looks like a standard USB thumb drive hides much more inside.

For this reason, I am using the Intel NCS with the OpenVINO toolkit for real-time, on-device inference that does not require cloud connectivity.

The overall architecture of the project is shown below:

Figure 1: The Architecture Flow

The entire workflow, from data collection to model optimization:

Figure 2: Model Creation to Optimization
  1. Data Collection: The dataset contains 27558 images of individual cells (infected and uninfected RBCs) segmented from thin blood smears.
  2. Data Augmentation and Data Preparation: Deep networks need a large amount of training data to achieve good performance. When only a little training data is available, image augmentation is usually required to build a powerful image classifier. Image augmentation artificially creates training images by applying one or more processing steps, such as random rotations, shifts, shears and flips.
  3. Defining the CNN Model:
Figure 3: Using TensorFlow 2.0 to define the model architecture.
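
The augmentation operations listed in step 2 can be sketched without any dependencies, treating an image as a nested list of pixel values. This is only an illustration of what flips, quarter-turn rotations and shifts do to the pixels, not the code used for this project; all function names here are hypothetical. In practice a library helper is the usual choice, for example Keras's ImageDataGenerator with its rotation_range, width_shift_range, height_shift_range, shear_range and horizontal_flip options.

```python
import random


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(image)]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(col) for col in zip(*reversed(image))]


def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, padding the left edge with `fill`."""
    width = len(image[0])
    pixels = max(0, min(pixels, width))
    return [[fill] * pixels + list(row[:width - pixels]) for row in image]


def random_augment(image, rng):
    """Apply a random combination of the operations above."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    if rng.random() < 0.5:
        image = flip_vertical(image)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        image = rotate_90_clockwise(image)
    return shift_right(image, rng.randrange(2))  # shift by 0 or 1 pixel


tiny = [[1, 2],
        [3, 4]]
print(flip_horizontal(tiny))      # [[2, 1], [4, 3]]
print(rotate_90_clockwise(tiny))  # [[3, 1], [4, 2]]
print(shift_right(tiny, 1))       # [[0, 1], [0, 3]]
```

Each call to random_augment with a fresh random state yields a slightly different variant of the same cell image, which is how augmentation stretches a limited dataset.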

4. Training the Model: I trained the model on an Intel-powered laptop, splitting the dataset into 70% for training and 30% for testing. Training took about 6 hours to converge.

Figure 4: A screenshot while training the model
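
The 70/30 split mentioned in step 4 can be sketched as follows. This is a minimal example that assumes the images are grouped by class and that each class is shuffled and split separately; whether the original split was done per class is not stated, and the file names below are hypothetical.

```python
import random


def train_test_split_by_class(paths_by_class, train_fraction=0.7, seed=42):
    """Shuffle each class separately and split it into train and test lists."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label, paths in sorted(paths_by_class.items()):
        shuffled = list(paths)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(path, label) for path in shuffled[:cut]]
        test += [(path, label) for path in shuffled[cut:]]
    return train, test


# Hypothetical file names; the real dataset holds 27558 cell images in total.
example = {
    "parasitized": [f"parasitized_{i}.png" for i in range(10)],
    "uninfected": [f"uninfected_{i}.png" for i in range(10)],
}
train, test = train_test_split_by_class(example)
print(len(train), len(test))  # 14 6
```

Splitting per class keeps the proportion of infected and uninfected cells the same in both subsets.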

5. Converting the *.h5 file to *.pb: After successfully training and testing the model, I saved it as cells.h5.

Figure 5: Saving the model using Keras.

Next, we have to convert the cells.h5 file into the *.pb format. The following code converts the model into frozen_model.pb.

Figure 6: Converting the .h5 file to .pb file
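
Since the code in Figure 6 is not reproduced here, the following is a hedged sketch of one commonly used way to freeze a Keras model under TensorFlow 2.x. It is not necessarily the approach used in this project: with TensorFlow 1.x the usual route is tf.compat.v1.graph_util.convert_variables_to_constants applied to the session graph. The function name freeze_keras_model is hypothetical; only the file names cells.h5 and frozen_model.pb come from the article.

```python
def freeze_keras_model(h5_path="cells.h5", out_dir=".", pb_name="frozen_model.pb"):
    """Load a Keras .h5 model and write it as a frozen TensorFlow graph."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2,
    )

    model = tf.keras.models.load_model(h5_path)

    # Wrap the model in a tf.function and trace it with the model's input spec.
    wrapped = tf.function(lambda x: model(x))
    concrete = wrapped.get_concrete_function(
        tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype)
    )

    # Replace variables with constants so the graph is self-contained.
    frozen = convert_variables_to_constants_v2(concrete)
    tf.io.write_graph(frozen.graph.as_graph_def(), out_dir, pb_name, as_text=False)

    # The tensor names are useful later when checking the converted model.
    return [t.name for t in frozen.inputs], [t.name for t in frozen.outputs]
```

Calling freeze_keras_model() in the directory that contains cells.h5 would write frozen_model.pb next to it, assuming TensorFlow 2.x is installed.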

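6. Model optimization using Intel® Distribution of OpenVINO™ Toolkit: Before the classifier can run on the inference hardware, the frozen model has to be optimized. The Model Optimizer converts it into a *.xml file, which describes the network topology, and a *.bin file, which holds the weights.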

Figure 7: Diagram of the Intel® Distribution of OpenVINO™ toolkit workflow.

Now, we will use the following code to run the Model Optimizer and generate the files needed for inference.

Figure 8: Using Model Optimizer to convert the .pb file into .xml and .bin files.
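
As a rough sketch of what the Model Optimizer call might look like, the helper below only builds and prints the command line; it does not run it. It assumes the 2019-era mo_tf.py script from the toolkit's deployment_tools/model_optimizer directory. The 64x64x3 input shape and the ir output directory are placeholders rather than values from this project, and FP16 is chosen because that is the precision the Neural Compute Stick runs.

```python
import shlex


def model_optimizer_command(pb_path="frozen_model.pb", output_dir="ir",
                            input_shape=(1, 64, 64, 3), data_type="FP16",
                            mo_script="mo_tf.py"):
    """Build the argument list for a Model Optimizer run (not executed here)."""
    shape = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return [
        "python3", mo_script,
        "--input_model", pb_path,    # frozen TensorFlow graph
        "--input_shape", shape,      # placeholder shape: batch, height, width, channels
        "--data_type", data_type,    # FP16 for the MYRIAD device
        "--output_dir", output_dir,  # where the .xml and .bin files are written
    ]


cmd = model_optimizer_command()
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed command can be pasted into a shell where the OpenVINO environment has been set up, after replacing the placeholder shape with the input size the model was trained on.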

Inference Time

We now have the *.xml and *.bin files required for inference. In the following code snippet (Figure 9), we import the inference_engine from the openvino module and create a function, pre_process_image, that processes the input image and converts it to the format the model requires.

Figure 9: Preparation for the inference.
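
The exact contents of pre_process_image are not shown here, so the following is only a dependency-free illustration of the layout change such a function typically performs. It assumes the converted model expects a batch in channels-first (N, C, H, W) order and pixel values scaled to the range 0 to 1; both are assumptions, and resizing to the network's input size is left out. With NumPy, the same reordering is image.transpose((2, 0, 1))[None].

```python
def hwc_to_nchw(image, divisor=255.0):
    """Convert one height x width x channel image into a 1 x C x H x W batch."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    planes = [
        [[image[y][x][c] / divisor for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]
    return [planes]  # the outer list is the batch dimension


# A hypothetical 2 x 2 RGB image.
pixel_grid = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]
batch = hwc_to_nchw(pixel_grid)
print(len(batch), len(batch[0]), len(batch[0][0]), len(batch[0][0][0]))  # 1 3 2 2
print(batch[0][0])  # red channel: [[1.0, 0.0], [0.0, 1.0]]
```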

In the following code snippet (Figure 10), we select our *.xml and *.bin files and choose the Intel NCS (MYRIAD) as the inference hardware.

Figure 10: Choosing the device for the inference.
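
A minimal sketch of the device selection and a single inference call is given below. It assumes the Python API of the 2019 R2-era toolkit (IECore and IENetwork); earlier releases used IEPlugin and later ones changed these classes, so the calls may need adjusting for your installation. The helper names load_on_device and infer_one are hypothetical.

```python
def load_on_device(xml_path, bin_path, device="MYRIAD"):
    """Read the IR files and load the network onto the chosen device."""
    # Imported lazily so this file can be loaded without OpenVINO installed.
    from openvino.inference_engine import IECore, IENetwork

    ie = IECore()
    net = IENetwork(model=xml_path, weights=bin_path)
    input_name = next(iter(net.inputs))    # name of the first input blob
    output_name = next(iter(net.outputs))  # name of the first output blob
    exec_net = ie.load_network(network=net, device_name=device)
    return exec_net, input_name, output_name


def infer_one(exec_net, input_name, output_name, batch):
    """Run one synchronous inference and return the output blob."""
    result = exec_net.infer(inputs={input_name: batch})
    return result[output_name]
```

Passing device="CPU" instead of "MYRIAD" is a convenient way to check the converted model on a machine that has no Neural Compute Stick attached.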

With everything set up, we can now run the inference.

Results

I ran the inference on 5000 sample images, and the whole process took about 40.86 seconds, or roughly 8 ms per image. This is far faster than analyzing the same batch of images under human supervision.

Figure 11: Calculating the time required for inference.
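
The reported figures translate into a per-image latency and a throughput as follows. The helper names are hypothetical; the only measured inputs are the 5000 images and 40.86 seconds reported above, and time_batch merely shows one way such a total could be measured.

```python
import time


def throughput(total_seconds, num_images):
    """Return (milliseconds per image, images per second)."""
    per_image_ms = 1000.0 * total_seconds / num_images
    images_per_second = num_images / total_seconds
    return per_image_ms, images_per_second


def time_batch(infer_fn, images):
    """Time how long `infer_fn` takes over all images, in seconds."""
    start = time.perf_counter()
    for image in images:
        infer_fn(image)
    return time.perf_counter() - start


per_image_ms, images_per_second = throughput(40.86, 5000)
print(f"{per_image_ms:.2f} ms per image, {images_per_second:.1f} images per second")
# 8.17 ms per image, 122.4 images per second
```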

Conclusion

The above illustrates the importance of AI in automating the legacy malaria testing technique and making it more efficient. Many remote places around the globe do not have access to the internet; in such scenarios, a device that runs inference locally is the best solution to fill the gaps. This recipe, using the Model Optimizer and Inference Engine of the Intel® Distribution of OpenVINO™ toolkit, gives another dimension to this project. I hope this type of innovation saves many lives!
