5 Minutes with Judith Goldberg

“We identified risk factors for breast cancer that have stood the test of time and estimated the benefits of screening on mortality.”

Published in Center for Data Science, Mar 28, 2018

Judith D. Goldberg, a Center for Data Science Affiliated Faculty Member, is a Professor of Biostatistics in NYU Langone Medical Center’s Departments of Population Health and Environmental Medicine. She graduated with a Sc.D. from Harvard in 1972. She began her career as a statistician at the Health Insurance Plan of Greater New York, was an Associate Professor at Mount Sinai School of Medicine, and has held leadership positions at Bristol-Myers Squibb and Lederle Laboratories. She was Founding Director of the Division of Biostatistics at NYU Langone from 1999 to 2013. Her research has focused on statistical methods for the evaluation of screening and diagnostic tests, the design and analysis of clinical trials, observational studies, and more. Her collaborative research spans areas from oncology to cardiovascular disease. She is a Fellow of the AAAS and of the American Statistical Association and received the 2015 Janet L. Norwood Award for Outstanding Achievement by a Woman in the Statistical Sciences and the 2016 Lagakos Outstanding Alumni Award from the Harvard T.H. Chan School of Public Health.

1. In your own research, how has data science enabled new ways of detecting, preventing, and treating disease?

I include statistics as a key component of data science. The explosion of data from multiple sources, much of it produced by new high-throughput technologies that allow us to evaluate genetic components of disease incidence and prognosis, has expanded what we can incorporate into clinical trials and observational studies. In particular, the computational advances in data science allow the analysis of gene expression data, proteomics, metabolomics, and data from other technologies, as well as the integration and analysis of data across multiple databases and data mining for exploratory analyses. Integrating all of these types of data allows us to develop models that could enhance the prediction of disease occurrence and outcomes. Developing personalized treatments and prevention strategies for patients is now a possibility.

2. Your recently published research includes a series of collaborative papers that evaluate radiation therapy regimens in patients with early breast cancer and immunotherapy in breast cancer patients. Could you tell us more about that?

This research, in collaboration with radiation oncologists and medical oncologists, led to the identification of improvements in radiotherapy techniques that reduced the long-term risks of heart and lung damage from radiotherapy with no loss of efficacy. In addition, in a series of treatment trials, we evaluated immunotherapy for breast cancer and incorporated the evaluation of immunologic and genetic markers, something that is now possible with the available high-throughput technologies and computational platforms. In this research, we also developed statistical approaches that allow us to use systematic missing at random (SMAR) study designs, which incorporate data from multiple domains using subsamples of the total study population to minimize costs.

3. Have you seen your research positively impact patients or the public in a way that has inspired you?

I was fortunate in my first position to be responsible for the analysis of the data that resulted from the Health Insurance Plan of Greater New York Breast Cancer Screening Study. We identified risk factors for breast cancer that have stood the test of time and estimated the benefits of screening on mortality. This landmark study led to the widespread use of mammography for the early detection of breast cancer.

4. What types of emerging data are you most excited to explore, and in what ways do you anticipate these new types of data will improve health practice and policy?

The ability to design studies that incorporate data from multiple domains, and to build integrated analyses that help us develop personalized treatments and identify individuals at risk of disease, is really exciting to me. The future lies in combining different types of data at the individual, group, and population levels to develop new insights for policy and practice, based on new approaches to analysis at the interface of statistics and computing.