Example of Histopathologic Images from PCam

Week 1 — Histopathological Cancer Detection

Tugay ÇALYAN
Published in bbm406f19
3 min read · Dec 1, 2019


Hello everyone. Today we want to talk about our Machine Learning project. We are Tugay Calyan, Anil Aydingun, and Denizcan Bagdatlioglu.

Just a couple of days before the due date

We will use machine learning and deep neural networks to automatically detect metastasised cancer. This is a promising area of medical imaging and diagnosis with potential clinical utility.

What is Metastasis?

Metastasis is the spread of cancer from the site where it started to other parts of the body. Doctors may describe it with different terms: metastatic cancer, advanced cancer, or stage 4 cancer. Because these terms can have different meanings, doctors are usually asked to explain where the cancer has spread.

Data Source

Original Source: Camelyon16

PCam is derived from the Camelyon16 dataset, which consists of 400 high-resolution whole-slide images (WSIs) of lymph node sections. Camelyon16 was prepared by combining two independent datasets, one from Radboud University Medical Center (Nijmegen, the Netherlands) and one from the University Medical Center Utrecht (Utrecht, the Netherlands). The following details are taken from the Camelyon16 site: https://camelyon16.grand-challenge.org/Data/.

The training dataset is divided into two parts. The first consists of 170 WSIs of lymph nodes (100 normal slides and 70 slides containing metastases). The second consists of 100 WSIs of lymph nodes (60 normal slides and 40 slides containing metastases).

The test dataset consists of 130 WSIs collected from both universities.
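As a quick sanity check, the slide counts quoted above can be added up to confirm that they match the 400 WSIs mentioned earlier. This is only a small sketch; the dictionary keys and variable names are our own and do not come from the Camelyon16 site.

```python
# Slide counts as quoted in the text above; the names are illustrative only.
train_sets = {
    "first": {"normal": 100, "metastasis": 70},
    "second": {"normal": 60, "metastasis": 40},
}
test_slides = 130

# Total number of slides in each training part.
train_per_set = {name: sum(counts.values()) for name, counts in train_sets.items()}
train_total = sum(train_per_set.values())
total_slides = train_total + test_slides

# Share of training slides that contain metastases.
metastasis_fraction = (
    sum(counts["metastasis"] for counts in train_sets.values()) / train_total
)

print(train_per_set)
print(train_total, total_slides)
print(round(metastasis_fraction, 3))
```

The training parts add up to 270 slides, and together with the 130 test slides this gives the 400 WSIs of the full dataset.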

Example of a metastatic region from Camelyon16.

PatchCamelyon (PCam)

PCam was prepared by Bas Veeling for practitioners who want to work on medical images using machine learning. Bas Veeling is a PhD student in machine learning for health from the Netherlands.

PCam consists of 327,680 color images (96x96 pixels) extracted from histopathological scans of lymph node sections. Each image is annotated with a binary label (1 indicates the presence of metastatic tissue, 0 indicates its absence). PCam is larger than CIFAR10, smaller than ImageNet, and models can be trained on it with a single GPU.
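To get a feeling for these numbers, here is a rough back-of-the-envelope sketch of the raw, uncompressed size of PCam compared with CIFAR-10 (60,000 color images of 32x32 pixels). We assume three color channels stored as one byte each; the helper `raw_bytes` is our own and the real files on disk may be compressed or stored differently.

```python
def raw_bytes(num_images, height, width, channels=3, bytes_per_value=1):
    """Uncompressed size in bytes, assuming one byte per color channel value."""
    return num_images * height * width * channels * bytes_per_value


# PCam figures come from the text above; CIFAR-10 has 60,000 images of 32x32 pixels.
pcam = raw_bytes(327_680, 96, 96)
cifar10 = raw_bytes(60_000, 32, 32)

print(f"PCam:     {pcam / 2**30:.2f} GiB")
print(f"CIFAR-10: {cifar10 / 2**30:.2f} GiB")
print(f"PCam is about {pcam / cifar10:.1f}x larger in raw pixels")
```

Under these assumptions PCam is a few gigabytes of raw pixel data, which is consistent with the claim that it is practical to work with on a single machine.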

In the author’s words:

Fundamental machine learning advancements are predominantly evaluated on straight-forward natural-image classification datasets. Think MNIST, CIFAR, SVHN. Medical imaging is becoming one of the major applications of ML and we believe it deserves a spot on the list of go-to ML datasets. Both to challenge future work, and to steer developments into directions that are beneficial for this domain.

We think PCam can play a role in this. It packs the clinically-relevant task of metastasis detection into a straight-forward binary image classification task, akin to CIFAR-10 and MNIST. Models can easily be trained on a single GPU in a couple hours, and achieve competitive scores in the Camelyon16 tasks of tumor detection and WSI diagnosis. Furthermore, the balance between task-difficulty and tractability makes it a prime suspect for fundamental machine learning research on topics as active learning, model uncertainty and explainability.


What are we planning to do?

First, we want to load the data properly using the PyTorch library. Before applying any preprocessing, we aim to see how pretrained models available for PyTorch perform on the dataset, and then select our model accordingly to improve our results.
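As a first idea of what the loading step could look like, here is a minimal sketch of a PyTorch `Dataset` for 96x96 patches with binary labels. It is not our final pipeline: `PatchDataset` is a hypothetical class name, and the tensors are random stand-ins rather than real PCam patches, since we have not settled on how to read the files yet.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class PatchDataset(Dataset):
    """Hypothetical wrapper for patches of shape (N, 96, 96, 3) and labels of shape (N,)."""

    def __init__(self, images, labels):
        assert len(images) == len(labels)
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Channels-last uint8 -> channels-first float32 scaled to [0, 1].
        image = self.images[idx].permute(2, 0, 1).float() / 255.0
        label = self.labels[idx].float()
        return image, label


# Random stand-in data, NOT real PCam patches.
generator = torch.Generator().manual_seed(0)
fake_images = torch.randint(
    0, 256, (8, 96, 96, 3), dtype=torch.uint8, generator=generator
)
fake_labels = torch.randint(0, 2, (8,), generator=generator)

loader = DataLoader(PatchDataset(fake_images, fake_labels), batch_size=4, shuffle=True)
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape, batch_labels.shape)
```

Each batch comes out in the channels-first layout that PyTorch convolutional models expect, so a pretrained network could be tried on top of a loader like this one.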

See you next week!
