Week 5 — Histopathologic Cancer Detection

Furkan Kaya
bbm406f19
Dec 30, 2019

Hello everyone! Today we share the fifth post in the series on our Machine Learning course project, Cancer Detection with Histopathological Data. As we approach the end of our work, this week we present our evaluation metrics. We also run our dataset through a simple model we trained and report our preliminary results. Then let's get started without keeping you waiting any longer!

Week 1 — Histopathologic Cancer Detection
Week 2 — Histopathologic Cancer Detection
Week 3 — Histopathologic Cancer Detection
Week 4 — Histopathologic Cancer Detection

Source: https://dramatictrainingsolutions.com/8-ways-to-measure-the-success-of-your-training-programs/

How Will We Evaluate Our Results?

Naturally, we wonder how successful the results we have worked toward throughout the project really are. How do we evaluate them? There are many different methods for this, but a few metrics are known for being especially well suited to classification problems, and these are the ones we will use: the confusion matrix and the ROC curve with its AUC. Let's learn together how these metrics work!

1- Confusion Matrix:

The confusion matrix has the form as follows:

  • TP means true positive: the model predicts positive, and the prediction is correct.
  • FP means false positive: the model predicts positive, but the prediction is incorrect.
  • FN means false negative: the model predicts negative, but the prediction is incorrect.
  • TN means true negative: the model predicts negative, and the prediction is correct.
Confusion Matrix

We will obtain and interpret success measures such as recall, precision, and accuracy from a confusion matrix built in the format above.
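As a small illustration, with made-up labels and predictions (not our model's actual outputs), the four counts and the metrics derived from them can be computed like this:

```python
# Toy ground-truth labels and model predictions (1 = tumor, 0 = no tumor).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)   # all correct predictions
precision = tp / (tp + fp)           # how many predicted positives are real
recall = tp / (tp + fn)              # how many real positives are found

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# → TP=4 FP=1 FN=1 TN=4
# → accuracy=0.80 precision=0.80 recall=0.80
```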

2- AUC — ROC Curve:

ROC is a probability curve, and the area under it is called the AUC (Area Under the Curve). Used to visualize the performance of models in classification problems, these metrics tell us how well a model can distinguish between classes. The higher the AUC, the better the model's predictions when classifying.
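An equivalent way to read the AUC: it is the probability that the model scores a randomly chosen positive example above a randomly chosen negative one. A minimal sketch of this pairwise definition, using toy labels and scores of our own invention:

```python
def roc_auc(y_true, scores):
    """AUC = probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and predicted tumor probabilities.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.8, 0.3]
print(roc_auc(y_true, scores))  # → 0.888... (8 of 9 pairs ranked correctly)
```

An AUC of 1.0 would mean every positive outranks every negative; 0.5 means the model ranks no better than chance.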

The Architecture We Experimented With

Now let's look at the results of the model we experimented with. As we explained in our previous blog post, we built a simple application on our dataset using the PyTorch framework to see how successful a simple convolutional neural network architecture can be.


First, we split our data into 80% training and 20% validation data: 176,020 training images and 44,005 validation images. We applied normalization to these images. We trained the model for 5 epochs with a learning rate of 0.002 and a batch size of 128. The training of the model went as follows:
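The split and loaders can be sketched in PyTorch roughly as follows. This is a minimal illustration with random tensors standing in for the real histopathology images, not our actual data pipeline:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in dataset: random 96x96 RGB tensors in place of the real images.
images = torch.rand(1000, 3, 96, 96)
labels = torch.randint(0, 2, (1000,))
dataset = TensorDataset(images, labels)

# 80% / 20% split, as we did with the real data (176,020 / 44,005 images).
n_val = int(0.2 * len(dataset))
train_set, val_set = random_split(dataset, [len(dataset) - n_val, n_val])

# Batch size 128, matching our training setup.
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128)
```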

· We applied batch normalization after the convolutional layers, followed by ReLU activation and then pooling. Finally, we made the model ready for binary classification with a fully connected layer.
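The conv → batch norm → ReLU → pool → fully connected pattern described above looks roughly like this in PyTorch. The channel counts and layer depths here are illustrative only; the exact parameters are the ones in the table:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Sketch of the pattern: conv -> batch norm -> ReLU -> pool, then a FC head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),   # batch norm right after the convolution
            nn.ReLU(),
            nn.MaxPool2d(2),      # 96x96 -> 48x48
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),      # 48x48 -> 24x24
        )
        # Fully connected layer for binary classification (tumor / no tumor).
        self.classifier = nn.Linear(32 * 24 * 24, 2)

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)
        return self.classifier(x)

model = SimpleCNN()
out = model(torch.randn(2, 3, 96, 96))  # two fake 96x96 RGB patches
print(out.shape)  # → torch.Size([2, 2])
```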

You can examine the parameters of our CNN architecture with the help of the table below:

Our CNN architecture for a preliminary result.

With this setup, the preliminary result of our project is as follows: an accuracy of 86.34% on our test data.


Of course, there are ways to improve this result. To help the model generalize, we can apply transformations such as rotation and flipping in addition to normalization. But we have seen that even a classic CNN architecture can achieve good results!
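At the tensor level, the idea behind these augmentations is simply to feed the network transformed copies of each image. A minimal sketch (in a real pipeline, torchvision's transforms such as RandomHorizontalFlip and RandomRotation would apply these randomly during loading):

```python
import torch

img = torch.rand(3, 96, 96)  # stand-in for one normalized image (C, H, W)

flipped = torch.flip(img, dims=[2])           # horizontal flip (mirror width axis)
rotated = torch.rot90(img, k=1, dims=[1, 2])  # 90-degree rotation in the H-W plane

# Both transformations preserve the image shape, so the model sees
# the same input size but a different view of the tissue.
assert flipped.shape == img.shape
assert rotated.shape == img.shape
```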

Thank you very much for sharing our excitement and this process with us. We hope to share our final results with you next week, please stay tuned! Happy weeks :)
