From Micro to Macro: Kyle Cranmer talks Machine Learning for the Natural Sciences

CDS-affiliated professor and particle physicist Kyle Cranmer explains how machine learning can help scientific research in biology, chemistry, and physics

NYU Center for Data Science
3 min read · Sep 26, 2017


Scientific research often shifts between two different scales: the microscopic and the macroscopic.

Of the two, the latter scale is perhaps the more challenging.

For example, in epidemiology (the study of how diseases spread), the microscopic narrative is relatively clear.

Someone coughs on you, or you touch a surface with a virus on it, and then you either get sick or not. You carry the disease; you spread the disease as you go about your day; then you recover, or you don’t.

But how can we extrapolate this microscopic picture to the macroscopic scale? How can we study the way disease spreads in a city, or across whole populations?

Today, scientists typically turn to simulations to study how disease spreads, but it is a challenge to extrapolate these simulations into a broader theory. The complex interactions at the microscopic scale give rise to emergent behavior at the macroscopic scale. The same is true for many areas of science.
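To make the micro-to-macro idea concrete, here is a minimal sketch of the kind of simulation described above. It is a toy, not a model used by Cranmer or by epidemiologists: the function name `simulate_outbreak` and every parameter value are assumptions chosen for illustration, and it simplifies the story by letting every infected person eventually recover. Each person follows two simple rules, yet the daily count of infected people traces out a rise-and-fall curve that no single rule spells out.

```python
import random

def simulate_outbreak(n_people=500, n_initial=5, contacts_per_day=8,
                      p_transmit=0.05, p_recover=0.1, days=120, seed=0):
    """Toy agent-based epidemic: returns the daily count of infected people."""
    rng = random.Random(seed)
    # Each person is 'S' (susceptible), 'I' (infected), or 'R' (recovered).
    state = ['I'] * n_initial + ['S'] * (n_people - n_initial)
    infected_per_day = []
    for _ in range(days):
        infected = [i for i, s in enumerate(state) if s == 'I']
        infected_per_day.append(len(infected))
        newly_infected = set()
        newly_recovered = []
        for i in infected:
            # Microscopic rule 1: an infected person meets a few random
            # people and may pass the disease on at each meeting.
            for _ in range(contacts_per_day):
                j = rng.randrange(n_people)
                if state[j] == 'S' and rng.random() < p_transmit:
                    newly_infected.add(j)
            # Microscopic rule 2: each day an infected person may recover.
            if rng.random() < p_recover:
                newly_recovered.append(i)
        # Apply all of the day's changes at once.
        for j in newly_infected:
            state[j] = 'I'
        for i in newly_recovered:
            state[i] = 'R'
    return infected_per_day

curve = simulate_outbreak()
peak_day = max(range(len(curve)), key=curve.__getitem__)
print(f"peak of {max(curve)} infected on day {peak_day}")
```

Changing `p_transmit` or `contacts_per_day` and rerunning shows how small changes in the microscopic rules can reshape the macroscopic curve, which is exactly what makes a general theory hard to extract from the simulation alone.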

[Photo: Kyle Cranmer]

Yet, as CDS professor and particle physicist Kyle Cranmer explains in his recent research, new directions in machine learning are starting to change this.

“The power of machine learning is the ability to efficiently summarize and generalize from many examples,” Cranmer explains. “If that data is coming from a simulation, machine learning can help us extrapolate from the microscopic picture of how things work, to an effective description of how the big system behaves.”
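One way to picture this is a "surrogate": run a simulator many times, then fit a simple model that predicts the large-scale outcome directly from the small-scale parameters. The sketch below is only an illustration under stated assumptions. The simulator `final_outbreak_fraction`, the parameter ranges, and the choice of a Gaussian kernel smoother are all hypothetical stand-ins, not the methods Cranmer's team uses.

```python
import math
import random

def final_outbreak_fraction(p_transmit, rng, n_people=300, contacts=8, p_recover=0.2):
    """Toy microscopic simulator: fraction of people ever infected."""
    susceptible, infected, recovered = n_people - 3, 3, 0
    while infected > 0:
        new_cases = 0
        for _ in range(infected * contacts):
            # A contact reaches a still-susceptible person with probability
            # S/N, and then transmits with probability p_transmit.
            hits_susceptible = rng.random() < (susceptible - new_cases) / n_people
            if hits_susceptible and rng.random() < p_transmit:
                new_cases += 1
        recoveries = sum(rng.random() < p_recover for _ in range(infected))
        susceptible -= new_cases
        infected += new_cases - recoveries
        recovered += recoveries
    return recovered / n_people

def fit_kernel_smoother(xs, ys, bandwidth):
    """Return a predictor: a Gaussian-weighted average of the training examples."""
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return predict

# Generate training examples by running the simulator at random settings.
rng = random.Random(0)
train_p = [rng.uniform(0.0, 0.1) for _ in range(150)]
train_y = [final_outbreak_fraction(p, rng) for p in train_p]

# "Learn" the macroscopic behavior from the simulated examples.
surrogate = fit_kernel_smoother(train_p, train_y, bandwidth=0.01)
for p in (0.01, 0.03, 0.08):
    print(f"p_transmit={p:.2f} -> predicted outbreak fraction {surrogate(p):.2f}")
```

Once fitted, `surrogate` answers "how big will the outbreak be?" without rerunning the simulation, though only for transmission probabilities inside the range it was trained on (here 0 to 0.1).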

Contrast this with language, health care, and the social sciences, areas that receive a great deal of attention from the machine learning community. These fields are primarily data-driven, because the mechanistic, causal narrative for what is going on is far from clear.

Cranmer thinks that machine learning experts might have great impact if they apply their skills to fields that already have well-developed simulations.

“Ironically,” he continues, “I think most people think that machine learning can solve problems in fields like language, social science, or health care first, and may eventually solve the problems in chemistry or physics. But I think that it is quite the opposite.”

What we need, Cranmer says, is a parallel effort in machine learning communities to focus on problems that are simulation-driven as well as those that are purely data-driven — and his work at CERN’s Large Hadron Collider (LHC) already involves doing precisely that.

Combining particle physics simulations with machine learning, his team is working to get the most out of the 50,000,000 gigabytes of data that the LHC produces per year in order to test out theories about the fundamental forces of the universe.

To learn more, check out Cranmer’s presentation here, or see the link to his NIPS workshop titled “Deep Learning for the Physical Sciences.”

by Cherrie Kwok
