Author: Michael Fitzke
1 in 4 dogs, and 1 in 5 cats, will develop cancer at some point in their lives. Pets today have a better chance of being successfully treated than ever, thanks to advances in early recognition, diagnosis and treatment.
AI and Cancer
“This is one of the biggest challenges in veterinary pathology. Do you think you can solve it?” Pathologists Dr. Edwards and Dr. Whitley asked in our first meeting. It was December 2018, and our team, Next Generation Technologies, had been founded that year to solve some of the most complex challenges at Mars through technology. At that point, our team of then just four people, with diverse backgrounds in machine learning, biostatistics, system design and operations, stared in fascination at the highly magnified tissue samples our colleagues from Antech Diagnostics presented to us. We listened intently as the pathologists described how the rate of cell duplication in a tumor is an important marker of severity, and how counting cells that are currently in the process of dividing (the process of mitosis) plays an important role in cancer grading.
The challenge: could we develop an AI system to detect and map mitotic figures for more reliable cancer prognosis?
Building the Model
At the beginning of any AI project, critical decisions must be made that influence the design of the system. We familiarized ourselves with academic work (Ciresan et al., 2013) and with available industry systems, and assessed how compatible they were with the challenge we faced. We gathered some useful insights, but the one big difference in our setting was the requirement on inference speed. Our colleagues at Antech look at hundreds of cancer cases and thousands of biopsies each day, so we knew that speed would have to play a large role in our design considerations. A standard academic modelling approach would start from a fixed data set, but we had to collect and label data ourselves, and how data collection would be handled depended on our guidance. Andrew Ng makes our point:
Label-Fit-Check-Repeat
We decided to count mitotic figures using a detection model. Having worked with fully convolutional networks previously, and knowing that this approach would be fast enough in prediction, we recognized that this would be a feasible option. Next, we looked at the labeling approach for this model. We could either label mitotic figures with a dot in the center or use the more labor-intensive labeling of the full segmentation mask of each mitotic figure. We chose the latter because the information flow from the image to the artificial neural network is greater than when using a dot to mark mitotic figures, and because knowing the full prediction mask opens up more post-processing options, such as shape-based second-stage classifiers, color filters and size statistics. We also wanted the option to easily produce synthetic training data with generative models like GANs, so the relatively large degree of control and the easy implementation of a pix2pix-style network were contributing factors.
We took the initial data points that pathologists had labeled, extracted 150x150 pixel image snippets, labeled the full mask and started to build the first model with 56 positive and 128 negative examples.
At this stage, we chose PyTorch as the framework to build our models. It was clear from the start that a lot of prototyping would be necessary. PyTorch has a great toolset for debugging and inspecting models, and its seamless API allowed us to quickly test different ideas and concepts.
The general architecture we used was an encoder-decoder network with skip connections, as laid out in Ronneberger et al.’s U-Net paper (Ronneberger et al., 2015). While our first model was very similar to this architecture, over time we changed modules and added new pathways. The rapid growth of the PyTorch community, and the availability of amazing new architectures like U-Net++ (Zhou et al., 2018) and Attention U-Net (Oktay et al., 2018), really helped us along the way.
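To make the architecture concrete, here is a minimal PyTorch sketch of a two-level encoder-decoder with skip connections. It is not our production network: the class name `TinyUNet`, the channel widths and the use of bilinear upsampling are illustrative assumptions. Upsampling to the exact size of the skip tensor is one simple way to handle inputs such as 150x150 pixels that do not halve evenly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two padded 3x3 convolutions, so the spatial size is preserved."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical two-level encoder-decoder with skip connections."""

    def __init__(self, in_ch: int = 3, base: int = 16):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.bottleneck = conv_block(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)
        # Decoder blocks take the upsampled features concatenated with the skip.
        self.dec2 = conv_block(base * 4 + base * 2, base * 2)
        self.dec1 = conv_block(base * 2 + base, base)
        self.head = nn.Conv2d(base, 1, kernel_size=1)  # one mask channel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s1 = self.enc1(x)                   # full resolution
        s2 = self.enc2(self.pool(s1))       # 1/2 resolution
        b = self.bottleneck(self.pool(s2))  # 1/4 resolution
        # Upsample to the skip tensor's size so odd sizes line up.
        u2 = F.interpolate(b, size=s2.shape[-2:], mode="bilinear", align_corners=False)
        d2 = self.dec2(torch.cat([u2, s2], dim=1))
        u1 = F.interpolate(d2, size=s1.shape[-2:], mode="bilinear", align_corners=False)
        d1 = self.dec1(torch.cat([u1, s1], dim=1))
        return self.head(d1)                # per-pixel logits
```

Because the network is fully convolutional, the same weights accept any input size and return a mask of matching height and width.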
Starting with the original labeled data set we fitted the first model and took a large set of unlabeled whole slide images. We then produced predictions for those slides and discussed the results with the pathologists at Antech. These conversations were crucial because they pointed us in the direction of systematic failures of our intermediate models that we could then eliminate.
To avoid a situation where a false negative of our first model would find its way into the training set, pathologists also labeled mitotic figures that the model didn’t detect in each iteration. However, false positives were still an issue, and it took a lot of careful data collection to arrive at our current production model.
Current Mitotic Figure Detector
Our final production system leverages three artificial neural networks. We trained all of them using the following pre-trained feature extractors as encoders:
- EfficientNet-B5
- EfficientNet-B3
- ResNeXt-50
We kept the general encoder-decoder architecture with skip connections. Our data set consists of images that are 150x150 pixels as well as images that are 600x600 pixels. Because mitotic figures are sparsely distributed, smaller images are often very easy for domain experts to label, while larger images take more time. However, speed and accuracy advantages can be realized by using larger images in inference. We trained all our networks by sampling batches from both subsets of our data and backpropagating the combined loss.
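A training step of this kind might look like the sketch below. It assumes a fully convolutional model that accepts both image sizes, a binary cross-entropy loss on the mask logits, and an unweighted sum of the two losses. The function name `combined_step`, the loss choice and the weighting are illustrative assumptions, not a description of our exact setup.

```python
import torch
import torch.nn as nn


def combined_step(model, optimizer, small_batch, large_batch) -> float:
    """One optimizer step on the summed loss of a 150x150 and a 600x600 batch.

    Each batch is a tuple (images, masks) with shapes
    (N, 3, H, W) and (N, 1, H, W); masks hold 0/1 floats.
    """
    criterion = nn.BCEWithLogitsLoss()
    optimizer.zero_grad()
    # The two batches have different spatial sizes, so they are passed
    # through the network separately and only the losses are combined.
    losses = [criterion(model(images), masks) for images, masks in (small_batch, large_batch)]
    total = torch.stack(losses).sum()
    total.backward()
    optimizer.step()
    return total.item()
```

Summing the losses before calling `backward()` means a single gradient update reflects both image sizes; a weighted sum would be an equally plausible variant.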
No Model Is an Island
The goal of our work was to add value for pathologists in clinical practice, so it was clear that a good detection model alone wouldn’t be enough. An AI system consists of multiple steps and components and must run quickly and robustly to be of value.
Due to the immense size of pathology images, we installed a cluster of GPU-backed servers that are physically co-located with the digital scanners. When new images are scanned, our system is notified and the image is downloaded to an available server in the cluster.
The system then performs a multi-step process consisting of:
1. Receiving and preparing the image
2. Count-or-no count slide classification
3. Tissue detection
4. Mitotic-figure detection
5. Post-processing
6. 10 high-power field selection
7. Feedback of results
1. Receiving and Preparing the Image
Whole Slide Images (WSI) are digital tissue samples. They are prepared by staining the tissue sample and then scanning it with high performance scanners.
An advantage of Mars integrating aspects of veterinary care and AI is our ability to control much of the variability in data and data quality that makes mitotic figure counting in general such a difficult challenge to solve.
2. Count-or-No Count Slide Classification
Since mitotic figure counting only needs to occur on cancer tissue, we could discard all whole slide images that don’t include cancer. To do that, we developed the count-or-no-count classifier. The pathologists labelled slides that had to be counted, and slides that didn’t, as part of their normal workflow. We then extracted 224x224 thumbnails of the slides together with their labels.
We trained a ResNet-18 from Torchvision that was pre-trained on ImageNet.
3. Tissue Detection
We only wanted to run the computationally expensive mitotic figure detector on actual tissue and not over background. To achieve that, we lowered the resolution so that each pixel represented a sliding window of the detector. We then applied thresholding to each pixel and returned the coordinates of pixels that include tissue.
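The idea can be sketched with NumPy as follows. The sketch assumes a grayscale slide scaled to [0, 1] where the background is close to white, reduces the resolution by block averaging, and uses an arbitrary threshold of 0.9. The function name `tissue_windows` and these specifics are illustrative assumptions.

```python
import numpy as np


def tissue_windows(gray: np.ndarray, window: int, threshold: float = 0.9):
    """Return top-left (row, col) pixel coordinates of windows containing tissue.

    gray: 2D array in [0, 1], where values near 1.0 are white background.
    window: side length of one detector window in pixels.
    """
    rows, cols = gray.shape[0] // window, gray.shape[1] // window
    # Crop to a multiple of the window size (edge remainders are dropped),
    # then average each window-sized block into a single low-res pixel.
    blocks = gray[: rows * window, : cols * window].reshape(rows, window, cols, window)
    low_res = blocks.mean(axis=(1, 3))
    # Low-res pixels darker than the threshold are treated as tissue.
    ys, xs = np.nonzero(low_res < threshold)
    return [(int(y) * window, int(x) * window) for y, x in zip(ys, xs)]
```

The returned coordinates can be handed directly to the sliding-window detector, so background regions are never evaluated.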
4. Running the Detector Over the Slide
The mitotic figure detector runs over the tissue in the whole slide image in a sliding-window process. For each sliding window, our three networks each predict a pixel mask, and the three masks are averaged. We leveraged PyTorch’s DataLoader capabilities and tuned the number of workers and the image size with regard to inference speed.
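A simplified version of this loop is sketched below. It assumes the slide is already in memory as a `(3, H, W)` tensor, that every window fits fully inside it, and that the networks output logits which are passed through a sigmoid before averaging. `WindowDataset` and `ensemble_predict` are hypothetical names, and reading windows from a real whole slide image would need a slide-reading library instead of tensor slicing.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset


class WindowDataset(Dataset):
    """Yields (window, row, col) for each requested top-left coordinate."""

    def __init__(self, slide: torch.Tensor, coords, window: int):
        self.slide, self.coords, self.window = slide, list(coords), window

    def __len__(self) -> int:
        return len(self.coords)

    def __getitem__(self, idx: int):
        r, c = self.coords[idx]
        patch = self.slide[:, r:r + self.window, c:c + self.window]
        return patch, r, c


@torch.no_grad()
def ensemble_predict(models, slide, coords, window, batch_size=8, num_workers=0):
    """Return {(row, col): averaged probability mask} for every window."""
    for m in models:
        m.eval()
    loader = DataLoader(
        WindowDataset(slide, coords, window),
        batch_size=batch_size,
        num_workers=num_workers,  # tune together with window size for throughput
    )
    results = {}
    for patches, rows, cols in loader:
        # Stack per-model probabilities and average over the model axis.
        probs = torch.stack([torch.sigmoid(m(patches)) for m in models]).mean(dim=0)
        for p, r, c in zip(probs, rows.tolist(), cols.tolist()):
            results[(r, c)] = p[0]  # drop the channel dimension
    return results
```

In practice, `num_workers`, `batch_size` and the window size interact, so they are best tuned by measuring end-to-end throughput on representative slides.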
5. Post-Processing
For each prediction mask, we perform thresholding and run post-processing steps:
- Instance detection
- Excluding small sized structures
- Combination of close structures into one mitotic figure
We use a mixture of deterministic algorithms and deep learning-based classifiers in this step.
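The deterministic part of these steps can be illustrated as below. The sketch thresholds a probability mask, finds 4-connected components with a breadth-first search, drops components below a minimum pixel count, and greedily merges centroids that lie close together. The function names, the connectivity, and all threshold values are assumptions for illustration; the deep learning-based classifiers are not shown.

```python
from collections import deque

import numpy as np


def find_instances(prob: np.ndarray, threshold: float = 0.5, min_pixels: int = 20):
    """Return centroids (row, col) of connected components with >= min_pixels."""
    mask = prob >= threshold
    seen = np.zeros(mask.shape, dtype=bool)
    centroids = []
    for r, c in zip(*np.nonzero(mask)):
        r, c = int(r), int(c)
        if seen[r, c]:
            continue
        seen[r, c] = True
        queue, pixels = deque([(r, c)]), []
        while queue:  # breadth-first flood fill over 4-connected neighbours
            y, x = queue.popleft()
            pixels.append((y, x))
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        if len(pixels) >= min_pixels:  # exclude small structures
            ys, xs = zip(*pixels)
            centroids.append((sum(ys) / len(ys), sum(xs) / len(xs)))
    return centroids


def merge_close(centroids, max_dist: float):
    """Greedily merge centroids closer than max_dist into their running mean."""
    merged = []  # entries are (sum_y, sum_x, count)
    for y, x in centroids:
        for i, (sy, sx, n) in enumerate(merged):
            if ((sy / n - y) ** 2 + (sx / n - x) ** 2) ** 0.5 < max_dist:
                merged[i] = (sy + y, sx + x, n + 1)
                break
        else:
            merged.append((y, x, 1))
    return [(sy / n, sx / n) for sy, sx, n in merged]
```

Merging close structures matters because a single mitotic figure, for example in late anaphase, can appear as two separate blobs in the prediction mask.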
6. Finding the Densest Area
In digital pathology, the standard ‘mitotic count’ is the maximum number of mitotic figures that can be found in 10 high power fields (a square region of 2.37 mm²).
Finding this area of highest density can be very computationally expensive when dealing with many mitotic figures. We solved this by placing the square region with its corner at each mitotic figure in turn and then sliding the square up for each mitotic figure that lies less than one side length of the square (about 1.54 mm for a 2.37 mm² square) above the initial mitotic figure.
To find the mitotic figures quickly, we organized their coordinates in a k-d tree. For extremely large numbers of mitotic figures, we also leveraged random search with early stopping once a very high mitotic count is reached.
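The search can be sketched in plain Python as follows. Instead of a k-d tree, this sketch swaps in a simpler technique: it sorts the figures by x and uses binary search to find the figures within one side length to the right of each anchor, then slides a window over their sorted y values. The side length is derived here as the square root of 2.37 mm². The function name `densest_square` is hypothetical, and the random search with early stopping is not shown.

```python
import math
from bisect import bisect_left, bisect_right

HPF10_AREA_MM2 = 2.37                   # area of 10 high power fields
SIDE_MM = math.sqrt(HPF10_AREA_MM2)     # side of the square, about 1.54 mm


def densest_square(points, side: float = SIDE_MM):
    """Return (count, corner) of the axis-aligned square holding most points.

    points: iterable of (x, y) coordinates in mm.
    corner: (x, y) of the square's minimum-x, minimum-y corner, or None.
    """
    pts = sorted(points)
    if not pts:
        return 0, None
    xs = [p[0] for p in pts]
    best_count, best_corner = 0, None
    for x0 in xs:  # anchor the square's left edge at each figure
        start = bisect_left(xs, x0)
        stop = bisect_right(xs, x0 + side)
        ys = sorted(p[1] for p in pts[start:stop])
        lo = 0
        # Slide the square vertically: two pointers over the sorted y values.
        for hi, y_hi in enumerate(ys):
            while y_hi - ys[lo] > side:
                lo += 1
            if hi - lo + 1 > best_count:
                best_count, best_corner = hi - lo + 1, (x0, ys[lo])
    return best_count, best_corner
```

Anchoring the square on the figures themselves is sufficient because any square can be shifted until one edge touches a figure without losing any of the figures it contains.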
7. Sending the Results Back to the Pathologists
After running our inference and processing the slides we wrote the results into a database. The results are then directly integrated in the workflow of pathologists, so they see the annotations when they open a new case for a pet.
Pathologist Feedback
“The AI algorithms have improved my efficiency as a diagnostic pathologist. Previously, figuring out the mitotic index of any tumors would have been a tedious task, this is especially true for large tumors. For example, a large mammary tumor from a mastectomy can span multiple digital slides, which means I would have to look through many areas at high magnification to determine the region with the highest density of mitotic figures. I would then have to tally up the mitotic figures within the region manually. In a diagnostic setting where pathologists are reading anywhere from 50–80 cases a day, these tasks take up significant amounts of precious time .”— Dr Wilson Yau, Anatomic Pathologist at Antech Diagnostics
“Having the algorithms determine the highest density of mitotic figures on each slide, and annotate them before I even look at the slides for the first time, has been incredibly helpful. I’m now able to dive right into the pre-selected areas, validate the annotated figures by taking into account false positives/negatives and non-neoplastic areas, and then quickly calculate the mitotic index for a tumor, no matter how large or heterogeneous the tumor is. I’m now also more confident about the reliability and accuracy of these mitotic counts. On any given day, the algorithms save up to an hour of my time helping me determine mitotic indices, which allows me extra time to focus on other parts of my job, such as communicating with clinicians to determine next-step diagnoses and treatment for pets.” — Dr Cindy Bacmeister, Veterinary Pathologist at Antech