Acute Leukaemia diagnosis using machine learning

Leukaemia and reduced blood efficiency

Published in Ixor, Jan 22, 2019

Acute Leukaemia (AL) is a cancer characterised by a strong proliferation of abnormal white blood cells (leukaemia cells) in the blood-forming tissues. Unlike normal blood cells, the leukaemia cells don’t necessarily die when they should. They can therefore crowd out the normal white and red blood cells in the bloodstream, which reduces the efficiency of the normal blood cells.[1]

Counting cells

One typical test in the AL diagnosis process is to check the ratio of abnormal cells (blasts) to normal blood cells in the bone marrow. The resulting value is known as the blast count. Traditionally, the counting is done manually under a microscope, a labour-intensive process performed by specialised pathologists. To reduce costs and make the pathologists’ work easier, there is a demand to automate the blast count test. At IxorThink we are designing an algorithm for this purpose.

How does it work?

The blast counting model comprises two steps. The input is a digital bone marrow smear image (size: 10^5 x 10^5 pixels), generated by a whole slide scanner. The first step identifies the optimal regions for cell counting. The second step identifies and counts the different cell types. Finally, the blast count ratio is calculated from the results of the second step. Below is an outline of the process.

Where do we count?

A suitable region for counting should (i) give a clear view of the internal cell structures and (ii) contain few overlapping cells. Unsuitable regions are often found on the left half of the slide, where the smear is too thick, which for example blurs the internal cell structures.

An example of an unsuitable (bad) and a suitable (good) region for cell counting

Our approach uses an XGBoost classifier to label extracted tiles (1000 x 1000 pixels) as suitable or unsuitable regions for counting cells, based on their colour histogram and sharpness index. To save processing time, tiles are only extracted between 65% and 85% of the width and between 10% and 90% of the height of the whole slide image.
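The tile selection and the two features can be sketched roughly as follows. This is a minimal illustration, not our production code: the function names, the number of histogram bins and the use of the variance of a discrete Laplacian as the sharpness index are all assumptions made for the example. The resulting feature vectors would then be passed to a gradient-boosted classifier such as XGBoost.

```python
import numpy as np


def tile_origins(slide_w, slide_h, tile=1000,
                 x_range=(0.65, 0.85), y_range=(0.10, 0.90)):
    """Top-left corners of the tiles inside the width/height window."""
    x0, x1 = round(slide_w * x_range[0]), round(slide_w * x_range[1])
    y0, y1 = round(slide_h * y_range[0]), round(slide_h * y_range[1])
    # Only keep tiles that fit completely inside the window.
    return [(x, y)
            for y in range(y0, y1 - tile + 1, tile)
            for x in range(x0, x1 - tile + 1, tile)]


def tile_features(tile_rgb, bins=8):
    """Per-channel colour histogram plus one sharpness value (assumed metric)."""
    hists = [np.histogram(tile_rgb[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    hist = np.concatenate(hists).astype(float)
    hist /= hist.sum()  # normalise so the tile size does not matter

    gray = tile_rgb.astype(float).mean(axis=2)
    # Discrete Laplacian on the interior pixels; blurry tiles give low variance.
    lap = (gray[:-2, 1:-1] + gray[2:, 1:-1] +
           gray[1:-1, :-2] + gray[1:-1, 2:] - 4.0 * gray[1:-1, 1:-1])
    return np.append(hist, lap.var())


origins = tile_origins(100_000, 100_000)
print(len(origins), "candidate tiles")
```

For a 10^5 x 10^5 pixel slide, this window reduces the number of candidate tiles from 10,000 to 1,600, which is where the saving in processing time comes from.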

Output of the count region classifier: green are suitable regions for cell counting, blue are unsuitable regions

Cell localisation

Now that we have found some interesting regions for counting, we need to localise the cells for our count. This is done with a model based on Faster R-CNN, built with Facebook’s Detectron framework and trained to predict bounding boxes around the cells of interest. An example output of our detector can be seen in the picture below.

Blast or no blast?

The final step of the blast count determination is the actual classification of the cells into blasts (leukaemia cells) and non-blasts. This is done by feeding rescaled boxes from the detector into our ResNet-based classifier. Our classifier is trained to assign each cell to one of the following categories: Blast, Non Blast or Unidentifiable. The training data is provided and annotated by a team of pathologists from the University Health Network and the University of Toronto.
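Cutting out and rescaling a detected box could look roughly like the sketch below. The function name, the (x0, y0, x1, y1) box format, the 224 x 224 output size and the nearest-neighbour resampling are assumptions for illustration only; a real pipeline would more likely rely on an image library for the resize.

```python
import numpy as np


def crop_and_rescale(image, box, out_size=224):
    """Crop box = (x0, y0, x1, y1) from an H x W x 3 image and resize it
    to out_size x out_size with nearest-neighbour sampling."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    # Clip the box to the image so detections at the border still work.
    x0, x1 = max(0, x0), min(w, x1)
    y0, y1 = max(0, y0), min(h, y1)
    if x1 <= x0 or y1 <= y0:
        raise ValueError("box does not overlap the image")

    crop = image[y0:y1, x0:x1]
    # Map every output row/column back to a source row/column.
    rows = np.arange(out_size) * crop.shape[0] // out_size
    cols = np.arange(out_size) * crop.shape[1] // out_size
    return crop[rows][:, cols]


tile = np.zeros((1000, 1000, 3), dtype=np.uint8)
cell = crop_and_rescale(tile, (120, 340, 180, 410))
print(cell.shape)
```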

The blast count can now be calculated as the fraction of blasts among the counted cells.

Under the current consensus, AL is diagnosed if the blast count is higher than 20%, based on a minimum of 500 counted cells.
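As a minimal sketch of this last step, the snippet below computes the blast count from a list of predicted labels and checks it against the 20% / 500 cells rule. Two points are assumptions rather than a description of our model: cells labelled Unidentifiable are left out of the denominator, and the 500-cell minimum is applied to the remaining cells. The function names are hypothetical.

```python
from collections import Counter

BLAST, NON_BLAST, UNIDENTIFIABLE = "Blast", "Non Blast", "Unidentifiable"


def blast_count(labels):
    """Return (ratio, counted): blasts / (blasts + non-blasts).

    Assumption: Unidentifiable cells are ignored in the ratio.
    """
    counts = Counter(labels)
    counted = counts[BLAST] + counts[NON_BLAST]
    if counted == 0:
        raise ValueError("no identifiable cells to count")
    return counts[BLAST] / counted, counted


def exceeds_al_threshold(labels, threshold=0.20, min_cells=500):
    """True/False if enough cells were counted, None if the count is too low."""
    ratio, counted = blast_count(labels)
    if counted < min_cells:
        return None  # not enough cells for a reliable blast count
    return ratio > threshold


predictions = [BLAST] * 150 + [NON_BLAST] * 400 + [UNIDENTIFIABLE] * 50
ratio, counted = blast_count(predictions)
print(f"blast count: {ratio:.1%} over {counted} cells")
```

Returning None for an undersized sample keeps "too few cells" separate from "below the threshold", which matters because the two cases call for different follow-up.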

A high-level overview of the output of our model. On the left, the total counts per class and the blast count (Ratio) can be seen. The yellow fields on the slide are the suitable regions for counting identified by our region classifier.

A detailed view of the output. The yellow field is one of the yellow fields in the picture above. The cells have been classified into one of the three classes: Blast, Non Blast or Unidentifiable.

Next steps: Let the model say “I don’t know”

Because the medical sector requires a high prediction reliability, we would like to let the model say “I don’t know” when the image differs too much from the training data, in order to reduce false positives. We are currently exploring an adapted loss function as described in the paper “Evidential Deep Learning to Quantify Classification Uncertainty”.

The need for this adaptation can be illustrated as follows: a convolutional neural network (CNN), such as our classifier, is basically a function that generates an output for any input. If a CNN is trained to classify cats and dogs, it will give an estimate of how certain it is that an image shows a cat or a dog. If you give the same CNN a picture of a horse, it will inevitably also produce an estimate of the picture being a cat or a dog. Maybe it will say only 0% cat and only 5% dog; however, it is equally possible that the outcome will be 80% cat and 20% dog, as it has never seen a horse during training.
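In the evidential approach from the paper mentioned above, the network outputs a non-negative amount of evidence per class instead of a plain probability. With K classes, evidence e_k gives Dirichlet parameters alpha_k = e_k + 1, expected probabilities alpha_k / S with S the sum of all alpha_k, and an uncertainty mass u = K / S. The sketch below only shows this final bookkeeping, not the loss function or the training. The evidence values, the function names and the 0.5 abstention threshold are made up for illustration.

```python
def evidential_summary(evidence):
    """Expected class probabilities and uncertainty mass u = K / S."""
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    k = len(evidence)
    alpha = [e + 1.0 for e in evidence]   # Dirichlet parameters
    strength = sum(alpha)                 # S = sum of alpha_k
    probs = [a / strength for a in alpha]
    return probs, k / strength


def predict_or_abstain(evidence, labels, max_uncertainty=0.5):
    """Return the most likely label, or "I don't know" if u is too high."""
    probs, u = evidential_summary(evidence)
    if u > max_uncertainty:
        return "I don't know", u
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], u


labels = ["Blast", "Non Blast", "Unidentifiable"]
# Hypothetical evidence for a clear blast and for a piece of dust.
print(predict_or_abstain([40.0, 1.0, 1.0], labels))
print(predict_or_abstain([0.2, 0.1, 0.1], labels))
```

When the total evidence is close to zero, as for an input unlike anything in the training data, S approaches K and u approaches 1, which is the signal used here to abstain.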

In our case, a similar situation could bias the blast count, e.g. when a piece of dust happens to be selected as a cell by our detector. More updates on this will follow.

IxorThink portal

At Ixor we have developed an interactive portal to facilitate data science projects. Clients can easily upload and annotate their data in a controlled environment, which comes in handy in projects like this, where second opinions on annotations are required. Moreover, our data scientists can easily extract the training data and upload the model’s test results, which can then be reviewed and corrected by the client.

Bibliography:

[1] SEER Cancer Stat Facts: Acute Myeloid Leukemia. National Cancer Institute. Bethesda, MD, https://seer.cancer.gov/statfacts/html/amyl.html

At IxorThink, the machine learning practice of Ixor, we are constantly trying to improve our methods to create state-of-the-art solutions. As a software company, we can provide stable products from proof of concept to deployment. Feel free to contact us for more information.