Dakshina Ranjan Kisku
May 24, 2020

COVID-19 Detection on Chest X-Ray and CT Scan Images of Coronavirus Suspected Individuals Using CNN and Image Processing based Data Augmentation

Coronavirus disease (COVID-19) is an infectious disease that came to light on December 31, 2019, when China informed the World Health Organization (WHO) about a pneumonia-like infection of unknown cause observed among people in Wuhan city, Hubei province, China. The coronavirus outbreak has so far infected more than 5.4 million people and caused the deaths of 343K people across the globe. The pandemic has spread to 185 countries, with a number of deadly strains of the coronavirus. India has reported 1,35,000+ positive cases and 3,800+ deaths so far. Because the coronavirus is highly infectious, it spreads rapidly among people who are exposed to COVID-19 infected individuals. The virus spreads through droplets of saliva or discharge from the nose when a COVID-19 infected person coughs or sneezes. A COVID-19 infected person may experience dry cough, fever, headache, muscle pain, sore throat and mild to moderate respiratory illness. However, older people and those with underlying medical conditions such as cardiovascular disease, diabetes, chronic respiratory disease and cancer are more likely to develop serious illness.

Because the cause of the pneumonia-type infection was initially unknown and the virus can generate new strains by mutation, it is extremely difficult to develop a cure in the form of a vaccine or medicine for COVID-19 patients. Therefore, WHO recommends more testing, and social distancing is being practiced in the high-alert zones of the countries affected by the pandemic. In the affected countries, reverse transcription polymerase chain reaction (RT-PCR) has been adopted as the standard diagnostic method for detecting viral nucleic acid in COVID-19 suspected individuals. The test takes 4–6 hours, or even a whole day, to give results. Because this is slow compared to the rate at which the coronavirus spreads, and because the test sometimes gives false positive and false negative results, chest X-Ray and/or CT scan images of COVID-19 suspected individuals could offer a faster and more efficient way to test for infection. Moreover, the limited number of RT-PCR tests and the shortage of test kits relative to the number of infected persons make RT-PCR testing alone inadequate.

In contrast, X-Ray and CT scan imaging is a widely accepted, traditional way of diagnosing a number of diseases and is common practice among radiologists and medics in healthcare and medical imaging. X-Ray and CT scan technologies have been used in medical diagnosis for several decades. In many highly affected regions or countries, it is difficult to provide enough RT-PCR test kits to test thousands of suspected people for COVID-19 infection. To address this issue, COVID-19 detection can instead be performed on chest X-Ray and CT scan images of suspected individuals who show COVID-19 symptoms.

NIT Durgapur has come up with a solution to the shortage of RT-PCR testing kits: an AI-based software application that uses an image processing based data augmentation technique to detect COVID-19 infection in suspected persons. With this integrated framework, both X-Ray and CT scan images of the chest can be tested for the virus. The application produces multiple representations of the same X-Ray and CT scan images through image processing techniques and mixes them with the original X-Ray and CT scan images to train a convolutional neural network (CNN) based deep learning model. The model can thus learn the underlying pattern of COVID-19 infected X-Ray and CT scan images more effectively, from both the representative images and the original images of the same person. Moreover, even with a simple CNN configuration, the software works well on a range of COVID-19 infected chest X-Ray and CT scan images. Figure 1 shows a chest X-Ray (top left) of a COVID-19 infected person and its multiple representations, whereas Figure 2 shows a chest CT scan image (top left) of a COVID-19 infected person and its multiple representations.

Figure 1. Chest X-Ray and its Multiple Representations of a COVID-19 Infected Individual.
Figure 2. Chest CT scan and its Multiple Representations of a COVID-19 Infected Individual.
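
The idea of training on several representations of the same image can be illustrated with a short sketch. The article does not name the exact image processing operations used, so the Sobel gradient filters below are an assumption, chosen only because they highlight discontinuities (edges) in an image. The function names `filter2d`, `rescale`, `representations` and `augment` are hypothetical.

```python
import numpy as np

# Sobel kernels (an assumed choice; the article does not name its filters).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def filter2d(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel, edge-padding to keep the size."""
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def rescale(image):
    """Map pixel values linearly to [0, 1]; a constant image becomes all zeros."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros(image.shape, dtype=float)
    return (image - lo) / (hi - lo)


def representations(image):
    """Return the original image plus three edge-based representations."""
    image = np.asarray(image, dtype=float)
    gx = filter2d(image, SOBEL_X)   # responds to vertical edges
    gy = filter2d(image, SOBEL_Y)   # responds to horizontal edges
    magnitude = np.hypot(gx, gy)    # overall edge strength
    return [rescale(image), rescale(np.abs(gx)),
            rescale(np.abs(gy)), rescale(magnitude)]


def augment(images, labels):
    """Mix originals with their representations; each copy keeps its label."""
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        for rep in representations(image):
            out_images.append(rep)
            out_labels.append(label)
    return np.stack(out_images), np.array(out_labels)
```

With these assumed filters, every input image yields four training images (the original and three edge maps), all carrying the label of the person they came from.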

The main objective of using a deep learning model is to achieve high classification accuracy on chest X-Ray and CT scan images by separating COVID-19 cases from non-COVID-19 cases. It is well known that training a deep model effectively requires a large number of example images of both COVID-19 and non-COVID-19 individuals. To achieve this, a number of representative images are generated using image processing techniques; these representations, which carry discontinuity information, are mixed with the original X-Ray and CT scan images separately, and the resulting augmented data is used to train the CNN based deep learning model. The X-Ray and CT scan databases are publicly available in a GitHub repository for scientific experiments. Both datasets contain chest images of COVID-19 and non-COVID-19 individuals. The X-Ray database contains 67 COVID images and the same number of non-COVID images, whereas the CT scan database contains 345 COVID images and the same number of non-COVID images. For the experiment, images are downsampled from their original size to 50×50 pixels. The random subsampling (holdout) method is adopted to test the efficacy of the model. In the holdout method, the whole dataset of COVID positive and negative samples is divided into training and testing samples in ratios such as 80:20, 70:30 and 60:40. It has been observed that as the number of training examples increases, the model exhibits higher classification accuracy, around 96% for CT scan images and 99% for X-Ray images. Moreover, this result remains consistent when the number of layers in the CNN based deep model is changed.

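
A holdout split at the ratios mentioned above might look like the following sketch. It is not the authors' code: the function name `holdout_split`, the placeholder file names and the fixed seed are assumptions, and a plain (non-stratified) shuffle is used because the article does not say whether the split was stratified by class.

```python
import random


def holdout_split(samples, labels, train_percent, seed=0):
    """Shuffle once, then cut the data into training and testing parts."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(indices) * train_percent // 100
    train = [(samples[i], labels[i]) for i in indices[:n_train]]
    test = [(samples[i], labels[i]) for i in indices[n_train:]]
    return train, test


# Placeholder file names standing in for the 67 + 67 X-Ray images.
samples = ([f"covid_{i}.png" for i in range(67)]
           + [f"non_covid_{i}.png" for i in range(67)])
labels = [1] * 67 + [0] * 67

for train_percent in (80, 70, 60):
    train, test = holdout_split(samples, labels, train_percent, seed=42)
    print(f"{train_percent}:{100 - train_percent} -> "
          f"{len(train)} train, {len(test)} test")
```

For the 134 X-Ray images this gives 107/27, 93/41 and 80/54 training/testing images at the three ratios. Repeating the split with different seeds corresponds to random subsampling. If augmented representations are used, splitting before augmenting keeps copies of one image from landing on both sides of the split.
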
To evaluate the framework in a robust and effective way, a number of evaluation metrics have been used: classification accuracy, loss, area under the ROC curve (AUC), precision, recall, F1 score and the confusion matrix. The values of these metrics have been determined for different ratios of training and test samples and for different numbers of layers in the deep model. The model is able to correctly separate the chest X-Ray and CT scan images of COVID-19 cases from those of non-COVID-19 cases.
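
As an illustration of how most of these metrics relate to one another, the sketch below computes them from true labels and predicted scores using their standard definitions. The function names `auc` and `evaluate`, the 0.5 decision threshold and the toy scores are assumptions for illustration only; they are not results from the article.

```python
def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, scores, threshold=0.5):
    """Binary classification metrics; label 1 = COVID, 0 = non-COVID."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": ((tn, fp), (fn, tp)),  # rows: actual 0, actual 1
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc(y_true, scores),
    }


# Toy scores, not data from the experiments.
metrics = evaluate([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.6, 0.2, 0.1])
```

In this toy example one COVID image is missed and one non-COVID image is flagged, so accuracy, precision, recall and F1 all equal 2/3, while AUC is 8/9 because only one of the nine positive/negative pairs is ranked in the wrong order.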

The CNN based deep learning model uses three types of layers: convolutional, pooling and fully connected. Two activation functions, ReLU and sigmoid, are used in the model. ReLU is applied after each convolutional layer, and the sigmoid function is used to classify a test image into the COVID or non-COVID class. In the training stage, the standard first-order stochastic gradient descent optimizer is used with a batch size of 32, a maximum of 30 epochs and a binary cross-entropy loss function. To alleviate overfitting, the training data is augmented using image processing techniques. This augmentation generates a large number of representative images carrying discontinuity information. Figure 3 shows the deep learning model with a number of its parameters.

Figure 3. Convolutional Neural Network of Layer Size 32 and 64.
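
To make the layer structure concrete, here is a minimal NumPy sketch of a forward pass through a network of this kind. Several details are assumptions rather than facts from the article: two convolutional blocks with 32 and 64 filters (one reading of the Figure 3 caption), 3×3 kernels without padding, 2×2 max pooling, a single grayscale input channel and random untrained weights. Only the forward pass and the loss are shown; training with stochastic gradient descent is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_relu(x, kernels, bias):
    """Unpadded convolution followed by ReLU.
    x: (H, W, C_in); kernels: (kh, kw, C_in, C_out); bias: (C_out,)."""
    kh, kw = kernels.shape[:2]
    # windows has shape (H-kh+1, W-kw+1, C_in, kh, kw)
    windows = sliding_window_view(x, (kh, kw), axis=(0, 1))
    out = np.einsum("hwcij,ijco->hwo", windows, kernels) + bias
    return np.maximum(out, 0.0)


def max_pool(x, size=2):
    """Non-overlapping max pooling over the two spatial axes."""
    h, w, c = x.shape
    h2, w2 = h // size, w // size
    x = x[:h2 * size, :w2 * size]
    return x.reshape(h2, size, w2, size, c).max(axis=(1, 3))


def init_params(seed=0):
    """Small random weights and zero biases (untrained, illustration only)."""
    rng = np.random.default_rng(seed)
    return {
        "k1": rng.normal(0.0, 0.01, (3, 3, 1, 32)), "b1": np.zeros(32),
        "k2": rng.normal(0.0, 0.01, (3, 3, 32, 64)), "b2": np.zeros(64),
        # Spatial size: 50 -> 48 -> 24 -> 22 -> 11, so 11 * 11 * 64 features.
        "w": rng.normal(0.0, 0.01, 11 * 11 * 64), "b": 0.0,
    }


def forward(image, params):
    """Return the sigmoid output for one 50x50 grayscale image."""
    x = np.asarray(image, dtype=float)[..., None]               # (50, 50, 1)
    x = max_pool(conv2d_relu(x, params["k1"], params["b1"]))    # (24, 24, 32)
    x = max_pool(conv2d_relu(x, params["k2"], params["b2"]))    # (11, 11, 64)
    z = x.reshape(-1) @ params["w"] + params["b"]               # dense layer
    return float(1.0 / (1.0 + np.exp(-z)))                      # sigmoid


def binary_cross_entropy(y, p, eps=1e-7):
    """Loss for one example with true label y and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return float(-(y * np.log(p) + (1 - y) * np.log(1.0 - p)))
```

Calling `forward(image, init_params())` on a 50×50 array returns a value between 0 and 1 that would be read as the predicted probability of the COVID class once the weights have been trained.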

The proposed AI application has been tested on publicly available databases of images from COVID and non-COVID individuals. The experimental results are found to be satisfactory, and the application has emerged as a useful tool for COVID-19 detection on chest X-Ray and CT scan images of the suspected population. The application has the following uses and benefits:

(a) Overcomes the shortage of RT-PCR kits

(b) Minimizes the cost of testing

(c) Is easy for diagnosticians and medical staff to use

(d) Can be used for rapid testing

(e) Can work in both offline and online modes

We are in the process of further improving the robustness and reliability of the application so that the software can be deployed for commercial use in the healthcare sector at the earliest.

References:

1. https://www.who.int/health-topics/coronavirus#tab=tab_1

2. https://www.mygov.in/covid-19

3. https://www.verywellhealth.com/medical-imaging-of-covid-19-4801178

4. http://deeplearning.stanford.edu/tutorial/supervised/ConvolutionalNeuralNetwork/

5. https://www.sciencedirect.com/topics/engineering/image-processing

6. Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing” 4th Edition, Pearson, 2018.

Key Developers:

Ms. Kiran Purohit, 2nd Year M. Tech. Student, Dept. of CSE, NIT Durgapur

Mr. Abhishek Kesarwani, Institute PhD Scholar, Dept. of CSE, NIT Durgapur

Dr. Dakshina Ranjan Kisku, Associate Professor, Dept. of CSE, NIT Durgapur

Dr. Mamata Dalui, Assistant Professor, Dept. of CSE, NIT Durgapur