Brain Tumor Classification

12 min read · Apr 13, 2020

Abstract :

Brain tumors are among the most aggressive diseases in both children and adults. They account for 85% to 90% of all primary Central Nervous System (CNS) tumors. Every year, around 11,700 people are diagnosed with a brain tumor. The 5-year survival rate for people with a cancerous brain or CNS tumor is approximately 34% for men and 36% for women. Brain tumors are classified as Benign Tumor, Malignant Tumor, Pituitary Tumor, etc. Proper treatment planning and accurate diagnosis are needed to improve the life expectancy of patients.
The best technique to detect brain tumors is Magnetic Resonance Imaging (MRI). The scans generate a large volume of images, which are examined by a radiologist. Manual examination can be error-prone because of the complexity of brain tumors and their properties.

Automated classification techniques using Machine Learning (ML) and Artificial Intelligence (AI) have consistently shown higher accuracy than manual classification. Hence we propose detecting and classifying brain tumors with Deep Learning algorithms, using Convolution Neural Networks (CNN), Artificial Neural Networks (ANN) and Transfer Learning (TL), to achieve higher accuracy. The MRI images are classified using different ANN and CNN deep learning models, built from different combinations of network parameters. The model with the highest accuracy is selected for further improvement by fine-tuning the hyper-parameters of the network. The aim of the project is to achieve higher accuracy and reliability on real-world MRI data using AI and ML domain knowledge. A further goal is to offer treatment suggestions and to make the software easy to access through the cloud, via mobile applications and web browsers.

Problem Definition :

To detect and classify brain tumors using the Deep Learning techniques CNN, ANN and TL, and to deploy the resulting system with Flask.

First, let us see why classifying brain tumors is so complex!
Can you classify these MRIs?

Pretty hard to find any difference, right?

So, the first MRI has Glioma and the second MRI has Meningioma. Here we can understand how difficult it is to spot the difference among tumor types. Misclassification of tumors may lead to the prescription of wrong treatment and medicines which could make the condition worse.

Approach :

Hence we propose to build a system which would classify an MRI into one of the following classes of tumor:

Benign
Malignant
No Tumor
Pituitary Tumor

We build multiple ANN (Artificial Neural Network), CNN (Convolution Neural Network) and TL (Transfer Learning) models and compare the accuracy, loss, and F1-score of each model to see which performs best.

LET’S BEGIN CODING!

Data:

Our project aims to classify MRIs into four classes, which makes it a four-class problem. The data for our neural networks is in image format (.jpg). It was collected from Kaggle (yes/no) and GitHub (private).

Overall, this Data-set was used.

This data is unclean and has a few incorrectly placed images, which were corrected, deleted or replaced. A few images from other medical publications were also added. The No-Tumor class had too little data, so a few no-tumor images were added from Google. Having an approximately equal number of images in each class is necessary to avoid bias in the neural networks.

Pre-Processing:

The image data gathered from GitHub covers the three tumor classes, while the Kaggle data supplies the no-tumor class. These images were split in a 7:3 ratio for the training and testing phases respectively.

Image Cropping:

The MRIs contain a black background around the central image of the brain. This background provides no useful information about the tumor and would only waste computation if fed to the neural networks. Hence it is useful to crop the images around the main contour. For this we use cv2.findContours() from the ‘cv2’ library.

Here we can see how the biggest contour is selected and marked. Next, we find the extreme points of the contour and crop the image at those points. This removes most of the unwanted background and some of the noise present in the original image. The process is repeated for each image in the dataset. Note, however, that cv2.findContours() sometimes fails to recognize the right contour and the image is cropped wrongly. Such images should be removed by manual inspection before entering the ‘augmentation’ phase.

Augmentation:

The amount of data gathered was very low, and a model trained on so few images would likely fit poorly and over-fit the training set. Hence we use a brilliant technique called Data Augmentation to increase the amount of data. It relies on rotations, flips, changes in exposure, etc. to create similar images. Using this technique we can multiply the size of the dataset many times over.

The output image of the cropping stage is given as input to ImageDataGenerator, a class in the keras.preprocessing.image module. It takes multiple arguments that decide how the augmentation takes place.

With the necessary arguments defined, ImageDataGenerator outputs multiple images, which are then saved to a separate folder. Augmentation is applied to every image of the dataset, increasing its size. With this much larger amount of data, the risk of ending up with a poorly fitted model is lowered.
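To show what augmentation does to a single image, here is a small stand-in written in plain NumPy. It is not the ImageDataGenerator API and not the settings we used. The augment helper, the chosen transforms and the brightness range are assumptions for illustration only.

```python
import numpy as np

def augment(img, rng):
    """Return simple variants of one image: two flips, a rotation, a brightness change."""
    variants = [
        np.fliplr(img),   # horizontal flip
        np.flipud(img),   # vertical flip
        np.rot90(img),    # 90 degree rotation in the image plane
    ]
    # Random brightness ("exposure") change, clipped back to valid pixel values.
    factor = rng.uniform(0.7, 1.3)
    brighter = np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)
    variants.append(brighter)
    return variants

rng = np.random.default_rng(0)
# Random pixels stand in for one cropped 150x150 MRI.
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)
augmented = augment(image, rng)
print(len(augmented), augmented[0].shape)  # 4 (150, 150, 3)
```

One input image becomes four extra training images here. ImageDataGenerator applies the same idea with random rotations, shifts and flips drawn from the ranges passed as arguments.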

Pickling :

After all the above steps, the data is still stored as separate image files in their respective folders. To feed it to a neural network, it must be converted into NumPy arrays. We therefore convert the images into four NumPy arrays: X_train, Y_train, X_test and Y_test, where X_train and X_test contain the images and Y_train and Y_test contain the labels. The data is read using cv2.imread(path_of_image,cv2.IMREAD_COLOR). Here cv2.IMREAD_COLOR is important, as it determines the dimensions of the input layer of our neural network later on. Next we use np.reshape(-1,IMG_SIZE,IMG_SIZE,3) to reshape the data stored in X_train and X_test. The number 3 indicates a three-channel color image rather than a greyscale image (note that OpenCV loads the channels in BGR order). IMG_SIZE specifies the resolution of the images: higher-resolution images contain more detail at the cost of a larger pickle file, while a lower IMG_SIZE may give pixelated images but saves storage and keeps the network light. So it’s important to find a sweet spot for the resolution (we selected 150). Once the processing, reshaping and splitting into images and labels is done for the train and test sets, we save the arrays as pickle files.

Saving these files with ‘pickle.dump()’ preserves all our progress from the pre-processing stage. We can later feed the pickle files directly to all our neural networks, which saves time because we do not have to pre-process the data every time we want to train a network. The next time we just use ‘pickle.load()’ to load all the data.
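The reshape and pickle round trip can be sketched as follows. Random pixel arrays stand in for the images that cv2.imread() would return, and the file name X_train.pickle and the temporary directory are assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np

IMG_SIZE = 150

# Four random "images" and their class labels stand in for the real dataset.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
          for _ in range(4)]
labels = [0, 1, 2, 3]

# Stack into the (n_images, IMG_SIZE, IMG_SIZE, 3) layout the networks expect.
X_train = np.array(images).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
Y_train = np.array(labels)

# Save once after pre-processing...
path = os.path.join(tempfile.mkdtemp(), "X_train.pickle")
with open(path, "wb") as f:
    pickle.dump(X_train, f)

# ...and load it again before every training run.
with open(path, "rb") as f:
    X_loaded = pickle.load(f)

print(X_loaded.shape)  # (4, 150, 150, 3)
```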

Model Building :

The interesting part! So until now we have cleaned, pre-processed, augmented, pickled the data and now we are ready to feed it to a ‘Neural Network’ and see how it performs! We will try three different yet similar techniques and see which technique generates models that have higher accuracies.

This process of creating multiple models was carried out for both Non-Augmented Data and Augmented Data, to better understand how augmentation helps in increasing accuracy.

Artificial Neural Network (ANN):

Artificial Neural Networks (ANN) are the simplest form of neural network, consisting of a stack of fully connected layers. They consist of an input layer that takes each pixel of the input image, multiple hidden layers for calculations, and an output layer which outputs four probabilities, one per class. Every node in one layer is connected to every node in the next layer. We make the network deeper by increasing the number of hidden layers, thus adding more computation and weights to the network.

Training works as follows. For every training example, perform a forward pass using the current weights and calculate the output of each node going from left to right; the final output is the value of the output layer. Compare the final output with the actual target in the training data, and measure the error using a loss function. Perform a backward pass from right to left, propagating the error gradients from the last layer back to every individual node using backpropagation. Calculate each weight’s contribution to the error, and adjust the weights accordingly using gradient descent.
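The loop described above can be sketched in NumPy for the simplest possible case: a single fully connected softmax layer with no hidden layers, trained on random toy data. The data, the layer sizes and the learning rate of 0.1 are assumptions for illustration. Keras does all of this for us in model.fit().

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))       # 8 toy examples with 5 features each
y = rng.integers(0, 4, size=8)    # integer labels for 4 classes
W = np.zeros((5, 4))              # weights of the single layer
b = np.zeros(4)                   # biases
LEARNING_RATE = 0.1

def forward(X, W, b):
    """Forward pass: linear layer followed by softmax."""
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    """Loss function: mean negative log-probability of the true class."""
    return -np.log(probs[np.arange(len(y)), y]).mean()

losses = []
for step in range(50):
    probs = forward(X, W, b)                       # forward pass
    losses.append(cross_entropy(probs, y))         # measure the error
    grad_logits = probs.copy()                     # backward pass:
    grad_logits[np.arange(len(y)), y] -= 1         # d(loss)/d(logits)
    grad_logits /= len(y)
    W -= LEARNING_RATE * (X.T @ grad_logits)       # gradient descent update
    b -= LEARNING_RATE * grad_logits.sum(axis=0)

print(losses[0] > losses[-1])  # True: the loss went down during training
```

With all weights at zero the network starts out predicting 25% for every class, so the first loss value equals ln(4). A deeper network repeats the backward step layer by layer using the chain rule.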

We therefore generate a few models combining 32, 64 or 128 nodes with 0, 1 or 2 dense layers. This gives us a total of 9 models. The model with the highest accuracy is the one with 128 nodes and 0 dense layers, at 80%. Not bad for an ANN. But the accuracy does not go over 80% no matter how many layers we add. The lack of convolution layers is what holds back our push for higher accuracy.

import numpy as np
from sklearn.metrics import classification_report
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense, Dropout, Flatten

# X_train, Y_train, X_test, Y_test are loaded from the pickle files.
# tensorboard and es are Keras callbacks assumed to be defined beforehand.
densel = [0, 1, 2]        # number of hidden dense layers
layers = [32, 64, 128]    # nodes per hidden layer

for dense in densel:
    for layer in layers:
        NAME = "no_nodes-{}-no_dense-{}".format(layer, dense)
        print(NAME)

        model = Sequential()
        model.add(Flatten(input_shape=X_train.shape[1:]))
        for _ in range(dense):
            model.add(Dense(layer))
            model.add(Activation('relu'))
            model.add(Dropout(0.25))
        model.add(Dense(4))
        model.add(Activation('softmax'))

        # Compile
        model.compile(loss='sparse_categorical_crossentropy', optimizer="adam", metrics=['accuracy'])

        # Fit the model
        history = model.fit(X_train, Y_train, batch_size=32, epochs=25,
                            validation_data=(X_test, Y_test), callbacks=[tensorboard, es])

        # Score
        scores = model.evaluate(X_test, Y_test, verbose=1)
        print('Test loss:', scores[0])
        print('Test accuracy:', scores[1])

        # Save model
        model.save("{}-model-{}-accuracy.h5".format(NAME, scores[1]))

        # Precision, recall, F1-score and support per class
        y_pred = model.predict(X_test, batch_size=64, verbose=1)
        y_pred_bool = np.argmax(y_pred, axis=1)
        print(classification_report(Y_test, y_pred_bool))
        print()
        print()

Convolution Neural Networks (CNN) :

Convolutional Neural Networks (CNN) are a variant of neural networks used heavily in the field of Computer Vision. They derive their name from the type of hidden layers they consist of. The hidden layers of a CNN typically consist of convolutional layers, pooling layers, fully connected layers, and normalization layers. CNNs are very good at feature extraction, so they can learn the features of all four classes.

The different types of layers in a CNN, such as the pooling layer, help extract the important features from the image. Hence adding convolution layers with pooling layers on top of a simple neural network makes it great for image recognition.

Training a CNN then amounts to learning the filters of its convolution layers. Pooling layers consider a block of input data and simply pass on the maximum value. Doing this reduces the size of the output and requires no added parameters to learn, so pooling layers are often used to regulate the size of the network and keep the system below a computational limit. An activation function adds non-linearity to the network.
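Max pooling is easy to check by hand. The sketch below pools a small 4x4 feature map with a 2x2 window, which is what MaxPooling2D(pool_size=(2, 2)) does per channel. The helper max_pool_2x2 and the example values are made up for illustration.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a single-channel feature map."""
    h, w = x.shape
    x = x[:h - h % 2, :w - w % 2]   # drop an odd last row/column, if any
    # Split into 2x2 blocks and keep the maximum of each block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 2]
#  [2 8]]
```

The output has a quarter of the values of the input, and nothing in the function is learned, which is why pooling is a cheap way to keep the network small.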

We therefore generate a few CNN models with 0, 1 or 2 dense layers, a layer size of 32, 64 or 128, and 1, 2 or 3 convolution layers. The models are generated from combinations of these parameters, giving a total of 21 models. Of these, the model ‘3-conv-128-nodes-1-dense’ has the highest accuracy, at 91%.

Now we know that a model having 3 Convolution layers and 1 Dense layer of 128 nodes shows the possibility to be a better model than the previous ANN model. Thus we now fine-tune the hyperparameters of this model to get higher accuracy.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Conv2D, Dense, Dropout, Flatten, MaxPooling2D

# X_train, Y_train, X_test, Y_test are loaded from the pickle files.
# tensorboard and es are Keras callbacks assumed to be defined beforehand.
densel = [0, 1, 2]        # number of hidden dense layers
layers = [32, 64, 128]    # filters per conv layer and nodes per dense layer
convl = [1, 2, 3]         # number of convolution layers

for dense in densel:
    for layer in layers:
        for conv in convl:
            NAME = "{}-conv-{}-nodes-{}-dense".format(conv, layer, dense)
            print(NAME)

            model = Sequential()
            model.add(Conv2D(layer, (3, 3), input_shape=X_train.shape[1:]))
            model.add(Activation('relu'))
            model.add(MaxPooling2D(pool_size=(2, 2)))

            for l in range(conv - 1):
                model.add(Conv2D(layer, (3, 3)))
                model.add(Activation('relu'))
                model.add(MaxPooling2D(pool_size=(2, 2)))

            model.add(Flatten())
            for _ in range(dense):
                model.add(Dense(layer))
                model.add(Activation('relu'))
                model.add(Dropout(0.25))

            model.add(Dense(4))
            model.add(Activation('softmax'))

            model.compile(loss='sparse_categorical_crossentropy', optimizer="adam", metrics=['accuracy'])

            # Fit the model
            history = model.fit(X_train, Y_train, batch_size=32, epochs=20,
                                validation_data=(X_test, Y_test), callbacks=[tensorboard, es])

Transfer Learning (TL):

Transfer learning is a machine learning technique where a model trained on one task is re-purposed for a second, related task. Many pre-trained models are available that are really good at classifying images, and we can use them to classify MRIs through transfer learning. The pre-trained models are available in ‘keras.applications’.

First we need to import these models and convert them to Keras’s Sequential model. The input shape of these pre-trained models may vary, so we need to change it to match our (IMG_SIZE, IMG_SIZE, 3). Also, the output layer of these pre-trained models is configured to classify 1000 classes of images, but we have only four. So we pop the last layer and add a dense layer of four nodes with a ‘softmax’ activation function.

Now we set the trainable attribute of each pre-trained layer to False and train the model for a few epochs. This model should perform better than all our previous models, as it is already well trained and its hyperparameters are finely tuned. The only issue with using TL is that the model size and depth are greater than those of the simple ANN and CNN models.

This time we trained only on Augmented data, using ResNet50, MobileNet_v2, VGG16, VGG19 and InceptionV3. Of these, VGG16 had the highest testing accuracy of 94% and an F1-Score of 94! Thus this model outperformed all our other ANN, CNN and TL models.

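from tensorflow.keras.applications import vgg16
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Note: VGG16() with default arguments expects 224x224x3 inputs.
vgg16_model = vgg16.VGG16()

# Convert to a Sequential model, leaving out the 1000-class output layer
model = Sequential()
for layer in vgg16_model.layers[:-1]:
    model.add(layer)

# Freeze the pre-trained layers
for layer in model.layers:
    layer.trainable = False

# Add the new four-class output layer
model.add(Dense(4, activation='softmax'))

model.compile(loss='sparse_categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
history = model.fit(X_train, Y_train, batch_size=32, epochs=20,
                    validation_data=(X_test, Y_test), callbacks=[tensorboard])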

Summary :

So we have created multiple models using ANN, CNN, TL. These models have various accuracies and F1-Scores. To best use the features of each model, we use ‘Ensemble Learning’! Ensemble Learning is a process using which multiple machine learning models are strategically constructed to solve a particular problem. They combine the decisions from multiple models to improve the overall performance.

Voting is a simple, easy-to-understand Ensemble Learning strategy. The top few models (3 in our case) are selected and each model is given the same image to predict the output. The prediction of each model counts as a ‘vote’, and the prediction made by the majority of the models is used as the final prediction.
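A minimal sketch of majority voting is shown below, assuming each model returns an array of class probabilities like the output of model.predict(). The function majority_vote and the three hand-written probability arrays are hypothetical. Ties are broken in favor of the lowest class index, which is a choice made for this sketch.

```python
import numpy as np

def majority_vote(prob_list):
    """Combine predictions from several models by majority vote.

    prob_list holds one (n_samples, n_classes) probability array per model.
    """
    n_samples, n_classes = prob_list[0].shape
    counts = np.zeros((n_samples, n_classes), dtype=int)
    for probs in prob_list:
        # Each model casts one vote per sample for its most likely class.
        counts[np.arange(n_samples), probs.argmax(axis=1)] += 1
    return counts.argmax(axis=1)

# Made-up outputs of three models for two images and four classes.
model_a = np.array([[0.7, 0.1, 0.1, 0.1], [0.1, 0.2, 0.6, 0.1]])
model_b = np.array([[0.2, 0.6, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]])
model_c = np.array([[0.6, 0.2, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]])

final = majority_vote([model_a, model_b, model_c])
print(final)  # [0 2]
```

For the first image two of the three models vote for class 0, and for the second image two vote for class 2, so those become the final predictions.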

Flask Deployment :

So far, everything runs from a command prompt, and there is no front end to turn this project into a complete system. Let’s use Flask, a web application framework based on Python. Using Flask we can connect these models to a GUI in the browser. Once the app runs locally, we can later deploy it on AWS, Heroku, Google Cloud, etc.

import os

import cv2
import numpy as np
from flask import Flask, render_template, request, send_from_directory
from tensorflow.keras.models import load_model
from werkzeug.utils import secure_filename

ALLOWED_EXTENSIONS = {'jpg', 'jpeg', 'png'}
IMG_SIZE = 150  # size of images
UPLOAD_FOLDER = 'uploads'
CLASSES = ["glioma", "meningioma", "no_tumor", "pituitary"]
model = load_model("model/best_model.h5")


def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1] in ALLOWED_EXTENSIONS


def prepare(filepath):
    img_array = cv2.imread(filepath, cv2.IMREAD_COLOR)
    new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
    return new_array.reshape(1, IMG_SIZE, IMG_SIZE, 3)


def predict(file):
    prediction = model.predict(prepare(file))
    return CLASSES[np.argmax(prediction)]  # map class index to label


app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


@app.route("/")
def template_test():
    return render_template('home.html', label='', imagesource='file://null')


@app.route('/', methods=['GET', 'POST'])
def upload_file():
    # Defaults cover the case where no valid file was uploaded
    output, file_path = '', 'file://null'
    if request.method == 'POST':
        file = request.files['file']

        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
            file.save(file_path)
            output = predict(file_path)  # call predict fn
    return render_template("home.html", label=output, imagesource=file_path)


@app.route('/uploads/<filename>')
def uploaded_file(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)


if __name__ == "__main__":
    app.run(debug=False, threaded=False)

The above code is saved as ‘app.py’ in Flask Project. This python file initializes our Flask app and renders an HTML template where we can upload an MRI and it would predict the tumor class.

After running the app.py file, we navigate to http://127.0.0.1:5000/, where the Flask app is launched. We test it with a random image and see the prediction.

We can see that the tumor class predicted by our Flask app is correct!

Conclusion :

Finally, we have created ANN, CNN and TL models that successfully classify brain tumors, and we connected them to Flask to provide a web-based service.

Links:

Website of the project