Brain Tumor Detector part 4
Introduction
A Convolutional Neural Network, CNN for short, is a popular deep learning model for image classification tasks. Implementing a CNN from scratch is easy; however, in my case its accuracy was only around 80%. A more complex model is required to reach higher accuracy.
Transfer learning is a machine learning technique in which what a model learned on one task is re-used on a different task. It will come up again later during fine-tuning.
A more complex model plus transfer learning is my go-to solution.
Code
Create the model
Select a model
Keras, which is part of TensorFlow, provides a list of pre-trained models I can reference. I tried MobileNet, InceptionV3 and EfficientNetV2B0:
- MobileNet: Smallest model size and lowest accuracy
- InceptionV3: Biggest model size and highest accuracy
- EfficientNetV2B0: Medium model size and medium accuracy
Weighing size against accuracy, I picked EfficientNetV2B0.
Create the model
There is a problem with using a pre-trained model: the pre-trained EfficientNetV2B0 was trained on the ImageNet dataset, which contains 1,000 classes, while my dataset only has 4 classes.
To solve this problem, I need to remove the top layer of the pre-trained model and attach a new Dense layer that outputs only 4 classes.
First, the image augmentation technique is applied to tackle data scarcity and overfitting. In this case I flip, rotate, translate, zoom in/out and change the brightness of the training images.
Overfitting occurs when a model learns the training set well but cannot generalize to the test set.
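As a minimal sketch, such an augmentation pipeline could be built from Keras preprocessing layers (the factors below are my own assumptions, not the original values):

```python
import tensorflow as tf

# Random transforms applied to training images only; these layers are
# inactive at inference time
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),      # flip
    tf.keras.layers.RandomRotation(0.1),           # rotate up to ±10% of a full turn
    tf.keras.layers.RandomTranslation(0.1, 0.1),   # translate
    tf.keras.layers.RandomZoom(0.1),               # zoom in/out
    tf.keras.layers.RandomBrightness(0.2),         # change brightness
])
```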
Next, the include_top parameter is set to False so that the pre-trained model's top layer is not included.
Finally, I attach my own layers. The last layer uses softmax activation, since the model outputs the 4 classes as a probability distribution.
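Putting these pieces together, a minimal sketch of the model could look like this (the 224×224 input size and the global average pooling layer are my assumptions, not necessarily the original setup):

```python
# Sketch: EfficientNetV2B0 backbone without its 1000-class ImageNet top,
# plus a new 4-class head. Reuses data_augmentation from the sketch above.
base_model = tf.keras.applications.EfficientNetV2B0(
    include_top=False,              # drop the 1000-class ImageNet classifier
    weights="imagenet",
    input_shape=(224, 224, 3),      # assumed input size
)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = data_augmentation(inputs)                    # random transforms, training only
x = base_model(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)  # collapse feature maps to a vector
outputs = tf.keras.layers.Dense(4, activation="softmax")(x)  # 4-class probabilities
model = tf.keras.Model(inputs, outputs)
```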
Freeze pre-trained model
Since I attached my own randomly initialized layers to the pre-trained model, training everything at once can lead to vanishing or exploding gradients. In other words, the pre-trained model forgets what it had learned.
To overcome this issue, all layers in the pre-trained model need to be frozen so that their weights are not updated during training.
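In Keras, freezing the backbone from the sketch above is a single assignment:

```python
# Freeze the EfficientNetV2B0 backbone so its pre-trained weights stay fixed;
# only the newly attached head will be updated during training
base_model.trainable = False
```

Note that trainable must be set before compiling the model, because compile captures the trainable state of each layer.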
Training
Compile the model
Before fitting the model on the data, it needs to be compiled.
The label_smoothing argument here is a regularization technique that prevents the model from becoming overconfident.
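A sketch of the compile step, assuming one-hot encoded labels (label_smoothing is an argument of CategoricalCrossentropy, not of the sparse variant); the Adam optimizer and the 0.1 smoothing value are my assumptions:

```python
# Compile with label smoothing as regularization against overconfidence
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=["accuracy"],
)
```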
Train the model
Call the fit method to train the model. Two callbacks are employed during training:
- ReduceLROnPlateau: Reduces the optimizer's learning rate when the model stops improving
- EarlyStopping: Stops training when the model stops improving
With these two callbacks, I end up with a model that is neither overtrained nor undertrained.
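A sketch of the training call; train_ds and val_ds are hypothetical tf.data datasets, and the monitored metric, patience values and epoch count are my assumptions:

```python
callbacks = [
    # Shrink the learning rate when validation loss plateaus
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.2, patience=3),
    # Stop early and keep the best weights once validation loss stops improving
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True),
]

history = model.fit(
    train_ds,                   # hypothetical training dataset
    validation_data=val_ds,     # hypothetical validation dataset
    epochs=50,
    callbacks=callbacks,
)
```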
The accuracy reaches 91%.
Conclusion
By using a more complex model than the from-scratch CNN, I pushed the accuracy even higher, to 91%.
Next
Use the fine-tuning technique so the model learns the dataset even better.