Identifying Lung Diseases With X-Ray Images Using Neural Networks

Vinicius Queiroz
6 min read · Mar 10, 2023


The year 2020 showed us how important rapid diagnosis of disease is, both for the safety of patients and health professionals and for meeting sudden surges in demand.

In this context, it is easy to see how machine learning tools can help the health sector, for example by performing pre-diagnoses from X-ray images.

With that in mind, this article will walk step by step through training a neural network for this task, from importing the database, through tuning the model's hyperparameters, to evaluating the results.

We will use an image database from Kaggle that contains X-ray images of normal, COVID, and viral pneumonia patients. You can download it here.

To simplify things and fix some errors in the dataset, we will create a “data” folder and move all the images from the training and test data into a subfolder named after their category, ending up with this folder structure:
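A sketch of the expected layout (the file names are just placeholders; the three folder names match the categories list used in the code below):

data/
├── covid/
│   └── ... (all covid images)
├── normal/
│   └── ... (all normal images)
└── viral_pneumonia/
    └── ... (all viral pneumonia images)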

1. Creating the Dataset

First of all, we will use the OpenCV library to read the images into numpy arrays and insert them into our dataset along with their corresponding category. Each step is commented below:

import numpy as np
import os
import cv2
import tensorflow as tf

tf.random.set_seed(0) # setting a seed makes our model reproducible for the same data

categories = ['covid', 'normal', 'viral_pneumonia'] # naming the categories
img_size = 128 # the image size we will use in our model
img_data = [] # the list we will append the data to

def create_dataset(): # this function will create the dataset
    for category in categories: # iterating through each category name we defined
        path = os.path.join('data', category) # the path of each category folder
        class_num = categories.index(category) # the index of the category (0, 1, 2)
        for img in os.listdir(path): # iterating over each image in the folder
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE) # reading the image into a numpy ndarray in grayscale to simplify the model
            new_array = cv2.resize(img_array, (img_size, img_size)) # resizing
            img_data.append([new_array, class_num]) # appending to our dataset variable

create_dataset()
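As a quick sanity check, we can confirm that the images were loaded and resized (the count assumes the 312 images reported later in this article):

print(len(img_data)) # 312, the total number of images
print(img_data[0][0].shape) # (128, 128), one resized grayscale image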

2. Data Preprocessing

Our dataset is currently ordered by category (the order in which we appended the images), and that can lead to problems in training. To fix that, we just need to shuffle the data randomly.

import random
random.seed(0) # seeding here as well keeps the shuffle reproducible
random.shuffle(img_data)

Splitting our data into features and labels

X = [] # features
y = [] # labels

for features, label in img_data:
    X.append(features)
    y.append(label)

X = np.array(X).reshape(-1, img_size, img_size, 1)/255 # reshaping and normalizing our data
y = np.array(y)

The final shape of our data is:

X shape: (312, 128, 128, 1)
y shape: (312,)
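These shapes can be verified with a trivial check:

print(f'X shape: {X.shape}')
print(f'y shape: {y.shape}')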

Splitting our data into training, test, and validation sets

from sklearn.model_selection import train_test_split as tts

X_train0, X_test, y_train0, y_test = tts(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = tts(X_train0, y_train0, test_size=0.1, random_state=0) # splitting the validation set out of the training data, not the full dataset, so it never overlaps the test set
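With these proportions, most of the images end up in the training set and the remainder is split between validation and test. A quick check (exact counts depend on sklearn's rounding):

print(len(X_train), len(X_val), len(X_test)) # roughly 224, 25 and 63 of the 312 images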

3. Model Building and Training

Now that our dataset is created, cleaned, and normalized, it's time to build our model.

In this example, we are going to use a simple Multilayer Perceptron model and tune its hyperparameters using the Keras Tuner tools. We'll do that because it's a faster and easier way to find good hyperparameters without having to search manually.

The step-by-step process is:

Step 1 — Build a function with your model inside it.
Step 2 — Use the “hp” argument along with your desired hyperparameter space to define the hyperparameters during model creation.
Step 3 — Initialize the tuner, specifying the objective used to select the best model, along with “max_trials” to set the number of different models to try.
Step 4 — Start the search and get the best model as an object.

  • In this example we are going to use the Bayesian Optimization tuner.

Commented code:

# importing the libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from keras_tuner.tuners import BayesianOptimization

# step 1
def build_model(hp):
    model = Sequential()
    model.add(Flatten(input_shape=X.shape[1:]))

    # step 2 - we'll use these hyperparameters inside the model layers
    hp_activation = hp.Choice(name='activation', values=['relu', 'tanh'])
    hp_layer_1 = hp.Int(name='layer_1', min_value=16, max_value=128, step=16) # here you set the minimum and maximum number of neurons in this layer, along with the step per iteration in the tuner
    hp_layer_2 = hp.Int(name='layer_2', min_value=16, max_value=128, step=16)
    hp_layer_3 = hp.Int(name='layer_3', min_value=16, max_value=128, step=16)
    hp_dropout_1 = hp.Choice(name='dropout_1', values=[0.0, 0.1, 0.2, 0.3]) # here the tuner will choose among a set of given values
    hp_dropout_2 = hp.Choice(name='dropout_2', values=[0.0, 0.1, 0.2, 0.3])
    hp_learning_rate = hp.Choice(name='learning_rate', values=[1e-2, 1e-3, 1e-4]) # same thing for the learning rates

    # use the arguments above in the layers below
    model.add(Dense(hp_layer_1, activation=hp_activation))
    model.add(Dropout(rate=hp_dropout_1))
    model.add(Dense(hp_layer_2, activation=hp_activation))
    model.add(Dropout(rate=hp_dropout_2))
    model.add(Dense(hp_layer_3, activation=hp_activation))
    model.add(Dropout(rate=hp_dropout_2)) # reusing the second dropout rate for the third dropout layer

    model.add(Dense(3, activation='softmax')) # softmax output with 3 classes (0, 1, 2, one for each category)

    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  metrics=['accuracy']) # you could use other metrics for this problem, but we'll use accuracy for simplicity
    return model

# step 3
tuner = BayesianOptimization(build_model,
                             objective='val_accuracy', # the metric the tuner uses to decide which hyperparameters fit our objective best
                             max_trials=16, # how many models will be tested before the tuning phase is over
                             overwrite=True)

# step 4 - start the search
tuner.search(x=X_train,
             y=y_train,
             epochs=64,
             batch_size=8,
             validation_data=(X_val, y_val))

After the search is over, the tuner object will hold our best hyperparameters according to the objective we passed.
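If you want to inspect the top trials before committing to one, keras_tuner also provides a summary method (a quick optional check):

tuner.results_summary(num_trials=3) # prints the hyperparameters and scores of the 3 best trials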

4. Final Model Training and Results

To see the results, we call tuner.get_best_hyperparameters and save the output to a variable.

Then we build our model using the best hyperparameters we found and fit it to the training data, using the test set to monitor validation metrics.

best_hps = tuner.get_best_hyperparameters(num_trials=1)[0] # getting the best hyperparameters
model = tuner.hypermodel.build(best_hps) # building the final model
history = model.fit(X_train, y_train, epochs=128, validation_data=(X_test, y_test), verbose=0) # fitting the model
score = model.evaluate(X_test, y_test, verbose=0) # evaluating on the test set
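Since we compiled the model with metrics=['accuracy'], model.evaluate returns the loss followed by the accuracy, so the results below were printed with:

print(f'Test loss: {score[0]}')
print(f'Test accuracy: {score[1]}')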

Results:

Test loss: 0.017198028042912483
Test accuracy: 1.0

The 100% test accuracy is probably due to the small sample size of the test dataset.

Evaluating our confusion matrix:

y_prob = model.predict(X_test)
y_pred = np.argmax(y_prob, axis=-1) # taking the class with the highest predicted probability

from sklearn.metrics import confusion_matrix as cm
import seaborn as sns
import matplotlib.pyplot as plt

matrix = sns.heatmap(cm(y_test, y_pred), cmap='Blues', annot=True)
plt.show()
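If you want the axes labeled with the category names instead of 0, 1 and 2, seaborn's heatmap accepts tick labels directly (a small optional tweak):

matrix = sns.heatmap(cm(y_test, y_pred), cmap='Blues', annot=True,
                     xticklabels=categories, yticklabels=categories)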

Plotting the loss and accuracy curves

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model accuracy and loss')
plt.ylabel('accuracy / loss')
plt.xlabel('epoch')
plt.legend(['accuracy', 'val_accuracy', 'loss', 'val_loss'], loc='upper right')
plt.show()

This model seems to be doing a good job at predicting the categories, even with a small sample.

5. Saving the Trained Model to a Pickle Object

To save it, we will just serialize the model to a pickle object and write it to our notebook directory.

import pickle

with open('trained_mlp_model.pkl', 'wb') as f: # using a context manager so the file is closed properly
    pickle.dump(model, f)
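Note that pickling Keras models can be fragile across TensorFlow versions. If you hit errors when loading the pickle, Keras' native saving API is a safer alternative (the filename below is just an example):

model.save('trained_mlp_model.h5') # saves the architecture, weights and optimizer state
# loading it back later:
# model = tf.keras.models.load_model('trained_mlp_model.h5')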

That's it! Our model is now trained.

You can access the full notebook and trained model here.

Useful links:
