Sound the Alarm! Deep Learning & Ultrasound Scans

Shubhang Desai
Stanford AI for Healthcare
9 min read · Feb 5, 2018

You know what time it is? It’s deep learning time.

As a popular and cheap modality of diagnosis, ultrasound presents the opportunity to create a large database of medical images. Since many diagnoses done by radiologists can be framed as classification tasks, it is natural to attempt to apply machine learning (ML) to these images. That being said, as of right now, ultrasound imaging is not a modality which is being explored in-depth by the machine learning community. This blog post will:

  • Give you some background on ultrasound
  • Explain what the current state-of-the-art in ML applied to ultrasound is
  • Outline current challenges and open questions in the problem

What is Ultrasound?

Ultrasound is a technique in which a transducer that emits ultra-high-frequency sound waves is placed on the skin [14]. The sound waves reflect off of organ boundaries in the body and are in turn picked up by the transducer. The time between a wave's initial emission and its return allows the scanner to create an image of the inside of the body.
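
This pulse-echo principle can be made concrete with a tiny calculation. A sketch, assuming the standard average speed of sound in soft tissue (~1540 m/s) that scanners conventionally use; the 65 µs echo time is an illustrative value, not from this post:

```python
# Estimate reflector depth from the round-trip time of an ultrasound echo.
# Scanners conventionally assume ~1540 m/s as the speed of sound in soft tissue.

SPEED_OF_SOUND_TISSUE = 1540.0  # meters per second

def echo_depth_m(round_trip_seconds: float) -> float:
    """Depth of the reflecting boundary, in meters.

    The wave travels to the boundary and back, so the one-way
    distance is half of speed * time.
    """
    return SPEED_OF_SOUND_TISSUE * round_trip_seconds / 2.0

# An echo returning after 65 microseconds corresponds to a depth of
# about 0.05 m, i.e. roughly 5 cm into the body.
depth = echo_depth_m(65e-6)
print(f"{depth:.4f} m")
```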

A B-mode ultrasound scan of the liver, taken by a GE LOGIQ machine

There are two main “flavors” of ultrasound: B-mode and Doppler. In B-mode ultrasound, the reflected sound waves create a simple still image of the anatomy. These still images are grayscale, so each one is effectively a 2-dimensional array of intensities. Meanwhile, in Doppler ultrasound, the distortion of the sound waves due to movement in the body is used to show the flow of fluids, such as blood through the veins. These movements are color-coded, adding a channel axis and making Doppler scans effectively 3-dimensional. In both B-mode and Doppler, still images can be taken in quick succession to create videos.
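
The dimensionality point can be seen directly in array shapes. A minimal sketch (the 480×640 frame size is illustrative, not tied to any particular machine):

```python
import numpy as np

# A grayscale B-mode frame is a single-channel 2-D intensity array.
bmode = np.zeros((480, 640), dtype=np.uint8)       # height x width

# A color-coded Doppler frame carries a third axis for the color channels.
doppler = np.zeros((480, 640, 3), dtype=np.uint8)  # height x width x RGB

print(bmode.ndim, doppler.ndim)  # -> 2 3
```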

Ultrasound can be used to take simple images of the anatomy (known as diagnostic ultrasound), or to capture more complex information about the body, such as the flow of blood or the softness of tissue (known as functional ultrasound). Ultrasound can also be used to interact with tissue through high-intensity sound beams (known as therapeutic ultrasound); one example of such an interaction is destroying blood clots, although this is not common in practice. Diagnostic and functional ultrasound produce images and videos to which we can apply machine learning, while therapeutic ultrasound does not.

Ultrasound is a relatively inexpensive and portable modality of diagnosis. The procedure is non-invasive and quickly gives radiologists information necessary to make diagnoses. Sonography machines are being made smaller and smaller (see: an ultrasound probe that can attach to a smartphone), making them more and more accessible to developing countries. As such, achieving radiology-level performance on ultrasound images would deliver impactful and feasible medical solutions to such countries.

Check out this NIH resource to learn more about ultrasound imaging: https://www.nibib.nih.gov/science-education/science-topics/ultrasound.

Working with Ultrasound Data

Ultrasound images generally come in a file format known as DICOM. This is a standard medical-imaging format that stores the pixel values for scans produced by various modalities, as well as additional parameters about the exam. The files are saved with a ‘.dcm’ extension. A great package for working with this format in Python is pydicom.

There is not a lot of preprocessing necessary for ultrasound images, unlike modalities such as CT. Because the pixel values stored in the DICOM files directly reflect what a radiologist would see, it is generally fine to keep the images as they are. Plus, the grainy nature of ultrasound images makes it quite difficult to isolate structures such as veins using traditional computer vision techniques, so heavy preprocessing would be cumbersome anyway.
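
In practice, about the only step usually needed is casting the raw pixel values to floats and scaling them before feeding them to a network. A minimal sketch, using a fake 8-bit array in place of a real scan (with pydicom, the array would come from `pydicom.dcmread("scan.dcm").pixel_array`):

```python
import numpy as np

def normalize_scan(pixels: np.ndarray) -> np.ndarray:
    """Min-max scale a raw ultrasound pixel array to [0, 1] floats."""
    pixels = pixels.astype(np.float32)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:  # guard against a constant (blank) frame
        return np.zeros_like(pixels)
    return (pixels - lo) / (hi - lo)

# Illustrative stand-in for a real DICOM pixel array.
fake_scan = np.array([[0, 64], [128, 255]], dtype=np.uint8)
normed = normalize_scan(fake_scan)
print(normed.min(), normed.max())  # -> 0.0 1.0
```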

Current Work

Although ultrasound has not been heavily explored in ML, there are a few papers which make up the current state of the art on the task. Some recent work is discussed below.

Identification

The most basic task for ML applied to ultrasound images is identification: given a scan, identify whether or not (and oftentimes where) it contains an abnormality. Surprisingly, prior to around 2010, there was already a push to apply neural networks (often referred to as Artificial Neural Networks, or ANNs, in medical papers) to medical problems. During this time, there were papers applying ANNs to ultrasound images to identify liver diseases [1], prostatic cancer [2], breast nodules [3], and deep vein thrombosis (DVT) [4]. These ANNs were what we would today call fully-connected networks: the most basic type of network, which takes a vector of real numbers as input and has no weight sharing (unlike the convolutional networks used for images). In other words, these papers extract numerical features from the ultrasound images as a preprocessing step, then pass those feature vectors through a shallow fully-connected network.
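
The pipeline behind those early ANN papers can be sketched as a tiny forward pass: hand-extracted numerical features go in, an abnormality probability comes out. The feature values and layer sizes below are purely illustrative, not taken from any of the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical hand-extracted features (e.g. mean echogenicity,
# texture statistics, lesion size) produced by a preprocessing step.
features = np.array([0.7, 0.2, 1.3, 0.5])

# One hidden layer, no weight sharing: every input connects to every unit.
W1 = rng.normal(size=(8, 4)); b1 = np.zeros(8)
W2 = rng.normal(size=(1, 8)); b2 = np.zeros(1)

hidden = np.tanh(W1 @ features + b1)
prob = sigmoid(W2 @ hidden + b2)  # predicted probability of abnormality
print(prob.shape)  # -> (1,)
```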

Example from [8] of possible kidneys in ultrasound scan (Ravishankar et al.)

About five years later, there was a period in which traditional ML techniques were applied to the ultrasound problem: Binary Decision Trees to detect DVT [5], and logistic regression [6] and SVMs [7] to classify breast tumors. It is only recently that deep learning has been applied to identification in ultrasound. To detect kidneys in ultrasound images, Ravishankar et al. use a method called transfer learning to fine-tune a CNN pretrained on ImageNet [8]. Zheng et al. also use transfer learning to detect abnormalities in kidney and urinary tract ultrasounds [9]. These early results are an extremely promising indication that ultrasounds are a prime nail to hit with the CNN hammer!
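
The transfer-learning recipe in [8] and [9] boils down to: keep a feature extractor whose weights were learned on another dataset frozen, and train only a small new classifier head on the ultrasound labels. A toy numpy sketch of that idea, with a random matrix standing in for the real pretrained CNN backbone and fake data standing in for scans and labels:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a pretrained backbone: a fixed, frozen projection.
# In a real pipeline this would be a CNN pretrained on ImageNet.
W_frozen = 0.05 * rng.normal(size=(16, 64))

def extract_features(images):
    """Frozen 'pretrained' feature extractor (never updated)."""
    return np.maximum(0.0, images @ W_frozen.T)  # ReLU features

# Tiny fake dataset: 32 flattened "scans" with binary abnormality labels.
X = rng.normal(size=(32, 64))
y = rng.integers(0, 2, size=32).astype(np.float64)

feats = extract_features(X)

# Train ONLY the new head (a logistic regression) with gradient descent;
# the backbone's weights stay fixed throughout.
w = np.zeros(16); b = 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    w -= 0.1 * (feats.T @ (p - y) / len(y))
    b -= 0.1 * np.mean(p - y)
```

Only the head's parameters `w` and `b` are updated; that is the whole point of the frozen-backbone variant of transfer learning (fine-tuning instead unfreezes some backbone layers and trains them with a small learning rate).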

Generation

A common application of ML to ultrasound is noise reduction: increasing the quality and/or resolution of low-quality ultrasound scans. Other models attempt to create ultrasound images from auxiliary inputs. These models can be grouped under the banner of “generative,” as they rely on convolutional networks that generate images to accomplish their task.

A recent paper makes use of a series of Generative Adversarial Networks (GANs) to transform echogenicity maps into realistic ultrasound images [10]. The pipeline consists of a physics-based B-mode simulator and two GANs which incrementally refine the initial simulation until a realistic output is achieved. This approach addresses a problem that current simulation systems face: the need to solve computationally intractable equations to produce the simulations. In this system, the GANs can produce a realistic scan almost instantly given the B-mode simulation.

Diagram from [10], which uses GANs to produce realistic ultrasound images (Tom and Sheet)

There is also work on increasing the quality of medical images. A recent paper uses convolutional networks to transform speckled, blurry ultrasound images into CT-quality images [12]; an even more recent paper uses a fairly simple convolutional architecture to increase the resolution of portable ultrasound machines [11]. These papers attempt to leverage the accessibility of the ultrasound modality to produce superior images: literally, using AI to make ultrasound a better method of diagnosis!

Segmentation

Segmentation is the task of taking an input and highlighting regions of interest. In the context of medicine, this may mean coloring a problematic area in a scan in order to call attention to it. For the specific modality of ultrasound, a popular segmentation challenge is finding cancerous tumors in breast ultrasound (BUS) scans. A recent benchmark study tested and compared the effectiveness of various machine learning approaches to the task by aggregating a fairly large dataset (562 images) of B-mode BUS scans and testing the approaches on it [13].

Figure from [13] showing how ground truth for tumor segmentation was generated (Xian et al.)

The study compares the effectiveness of five state-of-the-art approaches which use domain features and other traditional computer vision methods to perform segmentation. Work so far in ultrasound segmentation thus relies on traditional CV techniques; it will be interesting to see how this changes if and when deep learning is applied to the problem.
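
However segmentations are produced, they are scored against ground-truth masks with overlap metrics. One standard choice is the Dice similarity coefficient (one of several region-based metrics a benchmark like [13] might report); a minimal implementation:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|).

    Returns 1.0 for perfect overlap and 0.0 for none.
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:  # both masks empty: define as perfect agreement
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total

# Toy predicted vs. ground-truth tumor masks.
pred  = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_coefficient(pred, truth))  # -> ~0.667
```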

Challenges & Open Questions

Although ultrasound is certainly a cheap and convenient modality, there is not an abundance of labelled ultrasound images publicly available for machine learning tasks, due in part to patient privacy regulations. As such, one of the biggest challenges is a lack of labelled data. This has been addressed so far through transfer learning, in which a network trained on general image classification is further trained to classify ultrasound images. An open question is whether systems trained from scratch on a large dataset of ultrasound images, if one can be assembled, would outperform systems trained using transfer learning.

Because the amount of available ultrasound data is limited, we also fail to exploit one of the most interesting features of sonography: the ability to take videos. Given a sequence of ultrasound frames, it would be a fantastic experiment to feed the frames into a recurrent network, one per time step, to try to predict a diagnosis. The performance of a system trained on a time series of ultrasound images is still an open question.
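
As a sketch of that experiment: per-frame feature vectors (e.g. produced by a CNN) could be fed through a recurrent cell, with the final hidden state driving the diagnosis. All shapes and the random "frame features" below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

# Pretend each of 10 video frames has already been encoded
# into a 32-dim feature vector by some CNN.
frames = rng.normal(size=(10, 32))

# Vanilla RNN cell parameters (hidden size 16).
W_xh = 0.1 * rng.normal(size=(16, 32))
W_hh = 0.1 * rng.normal(size=(16, 16))
b_h = np.zeros(16)

h = np.zeros(16)
for x in frames:  # one recurrent step per video frame
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)

# The final hidden state summarizes the clip; a linear head scores it.
w_out = 0.1 * rng.normal(size=16)
prob = 1.0 / (1.0 + np.exp(-(w_out @ h)))  # probability of abnormality
print(0.0 < prob < 1.0)  # -> True
```

A practical system would more likely use an LSTM or GRU cell and train everything end-to-end, but the data flow — frames in sequence, one diagnosis out — is the same.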

Could ML do better on Doppler ultrasound than B-mode?

Most of the data collected from ultrasound machines for machine learning tasks is B-mode ultrasound. How would a system trained on Doppler ultrasound perform? The added input of blood/fluid flow would give the machine learning system an additional feature upon which to make a prediction, possibly benefiting its performance. Whether or not this is actually the case is still an open question.

Conclusion

There is still a lot of work to be done on applying machine learning to ultrasound images. However, historical work and a very recent resurgence of interest, together with the ease and practicality of the modality, make it an incredibly ripe problem for machine learning to tackle. As AI becomes more and more integral to healthcare, it will be interesting to see how diagnosis processes involving ultrasound are impacted — and how this impact can benefit the billions of people around the world without access to doctors.

Acknowledgements

I’d like to thank Matt Lungren MD MPH, Assistant Professor of Radiology at the Stanford University Medical Center, for his guidance and feedback throughout the writing process. I’d also like to thank Bhavik Patel, MD, MBA, Assistant Professor of Radiology at the Stanford University Medical Center, and Pranav Rajpurkar, Jeremy Irvin, Tanay Kothari, Aarti Bagul, & Nick Bien of the Stanford Machine Learning Group, for their comments.

References

[1] Computer-aided Diagnostic System for Diffuse Liver Diseases with Ultrasonography by Neural Networks. Ogawa et al. 6 December 1998.

[2] Artificial neural network analysis (ANNA) of prostatic transrectal ultrasound. Loch et al. 14 April 1999.

[3] Computer-aided Diagnosis of Solid Breast Nodules on Ultrasound with Digital Image Processing and Artificial Neural Network. Joo et al. 1 September 2004.

[4] Comparative Neural Network Based Venous Thrombosis Echogenicity and Echostructure Characterization Using Ultrasound Images. Dahabiah et al. 16 October 2006.

[5] Predicting Deep Venous Thrombosis Using Binary Decision Trees. Nwosisi et al. October 2011.

[6] Computer-Aided Diagnosis for the Classification of Breast Masses in Automated Whole Breast Ultrasound Images. Moon et al. April 2011.

[7] Combining support vector machine with genetic algorithm to classify ultrasound breast tumor images. Wu et al. 13 May 2011.

[8] Understanding the Mechanisms of Deep Transfer Learning for Medical Images. Ravishankar et al. 20 April 2017.

[9] Transfer Learning for Diagnosis of Congenital Abnormalities of the Kidney and Urinary Tract in Children Based on Ultrasound Imaging Data. Zheng et al. 31 December 2017.

[10] Simulating Patho-Realistic Ultrasound Images Using Deep Generative Networks with Adversarial Learning. Francis Tom and Debdoot Sheet. 8 January 2018.

[11] Deep Learning in RF Sub-sampled B-mode Ultrasound Imaging. Yoon et al. 21 December 2017.

[12] Towards CT-Quality Ultrasound Imaging Using Deep Learning. Vedula et al. 17 Oct 2017.

[13] A Benchmark for Breast Ultrasound Image Segmentation (BUSIS). Xian et al. 9 January 2018.

[14] Ultrasound. National Institute of Biomedical Imaging and Bioengineering. July 2016.
