Machine Learning for Detecting Gravitational Waves

Anya Kondamani
NYU Data Science Review
4 min read · Feb 19, 2024
Illustration of a star passing a supermassive black hole and experiencing the gravitational redshift predicted by Einstein’s theory of general relativity. Image by Nicole R. Fuller/NSF

Gravitational waves entered physics in 1916, when Albert Einstein showed that his theory of general relativity, completed the year before, predicts them: violent disturbances in space-time, such as the collision of two black holes, should send waves rippling outward from the source of the event. Einstein’s own mathematical model and several later ones made the idea highly plausible. The catch was that these waves were expected to be extremely faint, far too faint for any instrument of the time to detect. For decades after Einstein’s prediction, researchers worked towards detecting these waves, making serious advances in the 1960s.

In 1972, Rainer Weiss of MIT laid out a detailed design for an interferometric [1] gravitational wave detector, which became the model for future detectors. The concept was still far from a working observatory, so prototyping and testing continued with the aim of increasing the detectors’ sensitivity. In 1990, the National Science Board approved construction of the Laser Interferometer Gravitational-Wave Observatory (LIGO), a project managed by Caltech and MIT (LIGO Caltech Timeline). The project relies on massive interferometers, devices that merge two or more beams of light to create an interference pattern. LIGO’s two detectors are located in Hanford, Washington, and Livingston, Louisiana; a partner detector, Virgo, operates near Pisa, Italy. Each interferometer splits a laser beam and sends the resulting beams down two perpendicular arms, each several kilometers long. A passing gravitational wave causes minute changes in the lengths of these arms, which shift the interference pattern of the recombined beams and reveal the wave’s passage.

Schematic depicting how the interferometer functions in the presence of a gravitational wave. Image by LIGO.
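
To get a feel for how minute these changes in arm length are, here is a rough back-of-the-envelope sketch in Python. It relies on the standard definition of strain (the fractional change in length, h = ΔL / L). The 4 km arm length and the strain of 1e-21 are assumptions brought in for illustration, not figures from this article, and factors of two that depend on convention are ignored.

```python
def arm_length_change(strain, arm_length_m):
    """Return the change in arm length, delta_L = h * L, in meters."""
    return strain * arm_length_m


ARM_LENGTH_M = 4_000.0        # assumption: a 4 km interferometer arm
TYPICAL_STRAIN = 1e-21        # assumption: order of magnitude of a detectable wave
PROTON_DIAMETER_M = 1.7e-15   # approximate, for comparison only

delta_l = arm_length_change(TYPICAL_STRAIN, ARM_LENGTH_M)
print(f"arm length change: {delta_l:.1e} m")
print(f"fraction of a proton diameter: {delta_l / PROTON_DIAMETER_M:.1e}")
```

Under these assumptions the arm changes by only a small fraction of a proton's width, which is why the detectors need the extreme sensitivity discussed next.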

To capture passing gravitational waves, these detectors must be extraordinarily sensitive. That sensitivity comes at a cost: the detectors also respond to disturbances that have nothing to do with gravitational waves, producing glitches that contaminate the data sets. Currently, the only detection algorithms used in LIGO data cleansing “belong to the matched-filtering category”, meaning they compare the data against templates of expected waveforms and so only focus on a portion of the dimensional space at hand (Apostol and Truică, 2023). If these data sets were small enough to analyze by hand, the glitches could perhaps be identified and removed easily. However, with more than 200,000 auxiliary channels in each detector, analysis by hand is practically impossible. Computers, for their part, are not as insightful as researchers at recognizing errors or skews in the data caused by activity unrelated to the waves. Success in the search for gravitational waves therefore depends on keeping such glitches from contaminating the wave data.
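
For readers unfamiliar with matched filtering, the toy sketch below shows the basic idea: slide a known waveform template along noisy data and look for the offset where the two line up best. Everything here is an assumption made for illustration (the sample rate, the chirp-like template, the white Gaussian noise, and the function names). Real pipelines whiten the data and work in the frequency domain, so this should not be read as LIGO's actual implementation.

```python
import math
import random

SAMPLE_RATE = 1024  # samples per second (assumed for this toy example)


def make_template(n_samples):
    """Toy 'chirp': a windowed sine sweeping upward from 100 Hz."""
    template = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        phase = 2 * math.pi * (100.0 * t + 400.0 * t * t)
        window = math.sin(math.pi * i / (n_samples - 1)) ** 2
        template.append(window * math.sin(phase))
    return template


def matched_filter_snr(data, template, noise_sigma=1.0):
    """Correlate the template with the data at every offset.

    For white Gaussian noise, dividing by noise_sigma * ||template||
    gives a statistic with unit variance when no signal is present.
    """
    norm = math.sqrt(sum(v * v for v in template))
    n = len(template)
    snr = []
    for offset in range(len(data) - n + 1):
        corr = sum(data[offset + i] * template[i] for i in range(n))
        snr.append(corr / (noise_sigma * norm))
    return snr


def simulate(offset, amplitude=2.0, n_data=2048, n_template=256, seed=0):
    """Hide a scaled copy of the template in unit-variance white noise."""
    rng = random.Random(seed)
    template = make_template(n_template)
    data = [rng.gauss(0.0, 1.0) for _ in range(n_data)]
    for i, v in enumerate(template):
        data[offset + i] += amplitude * v
    return data, template


data, template = simulate(offset=900)
snr = matched_filter_snr(data, template)
peak = max(range(len(snr)), key=snr.__getitem__)
print(f"loudest match at sample {peak}, SNR = {snr[peak]:.1f}")
```

The filter finds the hidden signal only because its shape was known in advance. A glitch or an unmodeled signal that does not resemble any template can slip past or confuse this kind of search, which is the limitation the quoted passage points to.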

Enter machine learning and the art of classification, a key step in sifting through the noise and identifying true gravitational wave signals. This requires an algorithm that can cope with interference from a variety of sources, from a passing truck that creates faint vibrations to scattered light that interferes with the laser beam of the interferometer. More importantly, the algorithm must “ensure that the event is indeed a glitch of instrumental origin, rather than an unmodeled [gravitational wave] event” (Colgan et al., 2020). Because of this requirement, the ever-growing list of potential interference sources, and the subtlety of the glitches, such an algorithm isn’t as clear-cut as one would hope. Instead, it must capture the complex relationship between the interferometer and the data it collects.

In practice, glitch detection combines several techniques rather than relying on one. A popular technique is logistic regression, which Colgan et al. describe as a method that restricts the model’s output to values between 0 and 1, so that it can be read as the probability that a glitch is present. Another is the Deep Convolutional Neural Network (CNN), “designed to extract features from 2D matrices, such as images, and use these features for classification purposes” (Cuoco et al., 2020). Other popular tools include Omicron, software that calculates the signal-to-noise ratio in the data, and Wavelet Detection Filter, a Python package that filters signal data.
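
As a rough illustration of the logistic regression idea, the sketch below trains a tiny classifier from scratch on synthetic data. The two input features stand in for hypothetical summary statistics from auxiliary channels; the data, the feature meanings, and the training settings are all invented for this example and are not taken from Colgan et al.

```python
import math
import random


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, features):
    """Estimated probability that a sample is a glitch (label 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, labels, lr=0.1, epochs=200):
    """Fit weights by batch gradient descent on the log-loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_toy_data(n_per_class, seed=0):
    """Synthetic 'clean' (0) and 'glitch' (1) samples with two made-up features."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label, center in ((0, -1.5), (1, 1.5)):
        for _ in range(n_per_class):
            samples.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            labels.append(label)
    return samples, labels


samples, labels = make_toy_data(100)
weights, bias = train(samples, labels)
correct = sum(
    (predict_proba(weights, bias, x) >= 0.5) == bool(y)
    for x, y in zip(samples, labels)
)
print(f"training accuracy: {correct / len(samples):.2f}")
```

The sigmoid function is what keeps every prediction between 0 and 1. Thresholding that probability (here at 0.5, an arbitrary choice) turns it into a glitch or no-glitch decision.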

These techniques, some pre-existing and others tailored or improved specifically for LIGO data, extend beyond astrophysics to data-intensive work in general. Classification may sound simple, but in the case of LIGO it is far from it, and effective data cleansing can be the one thing standing between a project and success. The methods used to identify non-trivial signals in noisy data sets can be applied to environmental monitoring, financial modeling, and medical research.

All in all, LIGO’s successful data cleansing at such a large scale points to possibilities not just in astrophysics but in countless other disciplines. With advanced applications of machine learning and statistics, researchers are far less hindered by the difficulty of working with noisy data. In the past few years, LIGO has come very far in exploring the universe through the analysis of gravitational waves, and as these methods are refined, its performance is likely to keep improving.

[1]: Interferometric refers to measurement based on interference, in this case the interference of light beams.

References:

  1. Apostol, Elena-Simona and Ciprian-Octavian Truică. (2023). Efficient Machine Learning Ensemble Methods for Detecting Gravitational Wave Glitches in LIGO Time Series. arXiv. https://arxiv.org/abs/2311.02106
  2. Colgan, Robert E., K. Rainer Corley, Yenson Lau, Imre Bartos, John N. Wright, Zsuzsa Márka, and Szabolcs Márka. (2020). Efficient gravitational-wave glitch identification from environmental data through machine learning. Physical Review D 101, 102003. https://journals.aps.org/prd/pdf/10.1103/PhysRevD.101.102003
  3. Cuoco, Elena, Jade Powell, Marco Cavaglià, Kendall Ackley, Michał Bejger, Chayan Chatterjee, Michael Coughlin, Scott Coughlin, Paul Easter, Reed Essick, Hunter Gabbard, Timothy Gebhard, Shaon Ghosh, Leïla Haegel, Alberto Iess, David Keitel, Zsuzsa Márka, Szabolcs Márka, Filip Morawski, Tri Nguyen, Rich Ormiston, Michael Pürrer, Massimiliano Razzano, Kai Staats, Gabriele Vajente, and Daniel Williams. (2020). Enhancing gravitational-wave science with machine learning. Machine Learning: Science and Technology, Volume 2, Number 1. https://iopscience.iop.org/article/10.1088/2632-2153/abb93a#mlstabb93as2

