Detecting Breast Cancer with Convolutional Neural Nets

New tissue classifier could have significant clinical relevance

Breast density is classified into four categories: “fatty”, “scattered fibroglandular density”, “heterogeneously dense” and “extremely dense.” The latter two are considered “dense,” meaning the breast contains more fibroglandular tissue. Tissue is typically “dense” in younger women and becomes fattier after menopause. Around 50% of women over age 40 have “dense” breast tissue. Current research suggests that women with dense breast tissue are at higher risk of developing tumors. This is especially dangerous because dense breast tissue can conceal indicators of cancer on a mammogram.

Contributors to a recent publication on breast density classification include the Center for Data Science’s Nan Wu, Krzysztof J. Geras (also at the School of Medicine), Yiqiu Shen, Jingyi Su, and Kyunghyun Cho. The other authors are the NYU School of Medicine’s S. Gene Kim, Eric Kim, Stacey Wolfson, and Linda Moy.

With the eventual goal of using neural networks to diagnose breast cancer, recent research has attempted to use learning models to automate components of diagnosis at the level of human experts. Classifying breast density is an important part of breast cancer screening. While automated programs for measuring breast density already exist, they are not learning models. In their recent publication, the authors demonstrated that a convolutional neural network (CNN) trained on a large data set can classify breast density with results comparable to those of radiologists. The data set used in this study contained over 200,000 screening mammography exams, each with the four standard views used in screening mammography, an unprecedented size for a clinically realistic training data set. The researchers have made the architecture of their network and its parameters available online.
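To make the idea of a multi-view classifier concrete, here is a minimal sketch in plain Python of how a convolutional model might turn the four views of one exam into a probability for each of the four density categories. It is not the authors' architecture: the view names (the usual craniocaudal and mediolateral oblique views of each breast), the tiny 3x3 kernels, the random untrained weights, and the function names are all assumptions chosen for illustration.

```python
import math
import random

CLASSES = ["fatty", "scattered fibroglandular density",
           "heterogeneously dense", "extremely dense"]
VIEWS = ["L-CC", "R-CC", "L-MLO", "R-MLO"]  # assumed names for the four views


def conv2d_relu(image, kernel):
    """Valid 2D convolution (cross-correlation) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def global_avg_pool(fmap):
    """Collapse a feature map to a single number."""
    return sum(map(sum, fmap)) / (len(fmap) * len(fmap[0]))


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_density(exam, kernels, weights, biases):
    """exam maps each view name to a 2D list of pixel intensities."""
    # One pooled feature per (view, kernel) pair, concatenated across views.
    features = [global_avg_pool(conv2d_relu(exam[v], k))
                for v in VIEWS for k in kernels]
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical, untrained parameters and a random 8x8 "exam" for demonstration.
rng = random.Random(0)
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]
n_features = len(VIEWS) * len(kernels)
weights = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in CLASSES]
biases = [0.0] * len(CLASSES)
exam = {v: [[rng.random() for _ in range(8)] for _ in range(8)] for v in VIEWS}

probs = predict_density(exam, kernels, weights, biases)
print(dict(zip(CLASSES, probs)))
```

Because the weights are random, the printed probabilities carry no meaning. The point is only the data flow: each view is processed separately, the per-view features are combined, and one four-way prediction is made per exam. A real model would learn the kernels and weights from labeled exams.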

Fatty breast tissue appears dark on a mammogram because it absorbs less radiation than dense breast tissue, which appears brighter. The researchers also trained a classifier on pixel intensity histograms, leveraging this difference in radiation absorption, to provide a baseline for classification. They found that the large size of the training data set improved performance, though not enormously. Based on the results the CNN produced, the researchers conclude that the model may have significant clinical relevance.
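As a rough illustration of the pixel intensity histogram idea, the sketch below computes a normalized histogram for a small grayscale image and compares a darker, fatty-looking patch with a brighter, dense-looking one. The bin count, the 0 to 255 intensity range, the toy pixel values, and the function names are assumptions for illustration, not details from the publication.

```python
def intensity_histogram(image, bins=8, max_value=255):
    """Normalized histogram of pixel intensities in [0, max_value]."""
    counts = [0] * bins
    n = 0
    for row in image:
        for px in row:
            # Map the intensity to a bin index, clamping the top edge.
            idx = min(int(px * bins / (max_value + 1)), bins - 1)
            counts[idx] += 1
            n += 1
    return [c / n for c in counts]


def mean_bin(hist):
    """Average bin index, a crude one-number summary of brightness."""
    return sum(i * h for i, h in enumerate(hist))


# Toy patches (made-up values): darker pixels stand in for fatty tissue,
# brighter pixels for dense tissue.
fatty_like = [[20, 30, 40, 35], [25, 45, 30, 20]]
dense_like = [[200, 220, 180, 240], [210, 190, 230, 250]]

fatty_hist = intensity_histogram(fatty_like)
dense_hist = intensity_histogram(dense_like)
print(fatty_hist)
print(dense_hist)
print(mean_bin(fatty_hist) < mean_bin(dense_hist))
```

A histogram discards where the bright pixels are and keeps only how many there are, which is why it works as a simple baseline: a classifier trained on these vectors can pick up overall brightness, while a CNN can also use spatial patterns.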

By Sabrina de Silva