Brain Tumor Detector part 4
Introduction
A Convolutional Neural Network, CNN for short, is a popular deep learning model for image classification tasks. Implementing a CNN from scratch is easy; however, in my case its accuracy was only around 80%. A more complex model is required to reach higher accuracy.
Transfer learning is a machine learning technique in which what a model learned on one task is re-used on a different task. It will come up again later during fine-tuning.
A more complex model plus transfer learning is my go-to solution.
Code
Create the model
Select a model
Keras, which is part of TensorFlow, provides a list of pre-trained models I can reference. I tried MobileNet, InceptionV3 and EfficientNetV2B0:
- MobileNet: Smallest model size and lowest accuracy
- InceptionV3: Biggest model size and highest accuracy
- EfficientNetV2B0: Medium model size and medium accuracy
Weighing size against accuracy, I picked EfficientNetV2B0.
Create the model
There is a problem with using a pre-trained model: the pre-trained EfficientNetV2B0 was trained on the ImageNet dataset, which contains 1,000 classes, while my dataset only has 4 classes.
To solve this problem, I need to remove the top layer of the pre-trained model and attach a new Dense layer that outputs only 4 classes.
First, the image augmentation technique is applied to tackle data scarcity and overfitting. In this case I flip, rotate, translate, zoom in/out and change the brightness of the training images.
Overfitting occurs when a model learns the training set well but cannot generalize to the test set.
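As a minimal sketch, such an augmentation pipeline could be built from Keras preprocessing layers (the factors below are my own assumptions, not the original values):

```python
import tensorflow as tf

# Random transforms applied to training images only; these layers are
# inactive at inference time
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),      # flip
    tf.keras.layers.RandomRotation(0.1),           # rotate up to ±10% of a full turn
    tf.keras.layers.RandomTranslation(0.1, 0.1),   # translate
    tf.keras.layers.RandomZoom(0.1),               # zoom in/out
    tf.keras.layers.RandomBrightness(0.2),         # change brightness
])
```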
Next, the include_top parameter is set to False so that the pre-trained model's top layer is not included.
Finally, I attach my own layers. The last layer uses softmax activation, since the model outputs the 4 classes as a probability distribution.
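Putting these pieces together, a minimal sketch of the model could look like this (the 224×224 input size and the global average pooling layer are my assumptions, not necessarily the original setup):

```python
# Sketch: EfficientNetV2B0 backbone without its 1000-class ImageNet top,
# plus a new 4-class head. Reuses data_augmentation from the sketch above.
base_model = tf.keras.applications.EfficientNetV2B0(
    include_top=False,              # drop the 1000-class ImageNet classifier
    weights="imagenet",
    input_shape=(224, 224, 3),      # assumed input size
)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = data_augmentation(inputs)                    # random transforms, training only
x = base_model(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)  # collapse feature maps to a vector
outputs = tf.keras.layers.Dense(4, activation="softmax")(x)  # 4-class probabilities
model = tf.keras.Model(inputs, outputs)
```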
Freeze pre-trained model
Since I attached my own randomly initialized layers to the pre-trained model, training everything at once can lead to vanishing or exploding gradients. In other words, the pre-trained model forgets what it had learned.
To overcome this issue, all layers in the pre-trained model need to be frozen so that their weights are not updated during training.
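In Keras, freezing the backbone from the sketch above is a single assignment:

```python
# Freeze the EfficientNetV2B0 backbone so its pre-trained weights stay fixed;
# only the newly attached head will be updated during training
base_model.trainable = False
```

Note that trainable must be set before compiling the model, because compile captures the trainable state of each layer.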
Training
Compile the model
Before fitting the model on the data, it needs to be compiled.
The label_smoothing argument here is a regularization technique that prevents the model from becoming overconfident.
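A sketch of the compile step, assuming one-hot encoded labels (label_smoothing is an argument of CategoricalCrossentropy, not of the sparse variant); the Adam optimizer and the 0.1 smoothing value are my assumptions:

```python
# Compile with label smoothing as regularization against overconfidence
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=["accuracy"],
)
```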
Train the model
Call the fit method to train the model. Two callbacks are employed during training:
- ReduceLROnPlateau: Reduces the optimizer's learning rate when the model stops improving
- EarlyStopping: Stops training when the model stops improving
With these two callbacks, I end up with a model that is neither overtrained nor undertrained.
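A sketch of the training call; train_ds and val_ds are hypothetical tf.data datasets, and the monitored metric, patience values and epoch count are my assumptions:

```python
callbacks = [
    # Shrink the learning rate when validation loss plateaus
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.2, patience=3),
    # Stop early and keep the best weights once validation loss stops improving
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True),
]

history = model.fit(
    train_ds,                   # hypothetical training dataset
    validation_data=val_ds,     # hypothetical validation dataset
    epochs=50,
    callbacks=callbacks,
)
```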
The accuracy reaches 91%.
Conclusion
By using a more complex model than the from-scratch CNN, I pushed the accuracy even higher, to 91%.
Next
Use the fine-tuning technique so the model learns the dataset even better.