Finding Exoplanets with Deep Learning

Jason Terry
Sep 29, 2022


GSoC 2022

ML4Sci

Introduction

Since the first exoplanet was identified in 1992, there has been an explosion in the number and variety of exoplanets found. Several different detection methods have found success in locating exoplanets in formed stellar systems.

Detecting forming exoplanets is a challenge. These young objects are embedded deep within a protoplanetary disk: the site of planet formation. The layers of dust and gas between the planet and the surface of the disk make it very difficult to peer in and directly observe the planet. The observation of young or forming exoplanets is a crucial pillar of the development of our theories of planet formation, so it is important that we develop the capabilities to consistently do so.

A protoplanetary disk with gaps cleared by planets: HL Tau (ALMA/NRAO/ESO/NAOJ)

Previous work in this realm has largely focused on the morphological effects of the planet on the disk, e.g. planets may clear gaps in the disk that can be observed. While these methods have found success, there are a variety of confounding physical processes that may create gaps as well. Magnetic fields, for instance, are capable of doing so.

A different approach is to look at the effects of the planet on the motion of the gas and dust within the disk. As a planet plows through the disk, the material’s motion is disturbed. By observing the disk in line emission (often using an isotopologue of carbon monoxide), we can map which regions of the disk are moving at a given line-of-sight velocity. Each such map is called a “velocity channel”. This is the basis of kinematic analysis.

The simplest case of motion is Keplerian motion. In this regime, the only influences on the orbits of the gas and dust are the distance from the star and the stellar mass itself. Only gravity from the central star is at work. Any deviations from Keplerian motion are indicative of the presence of some other process or body in the disk.
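To make the Keplerian baseline concrete, here is a minimal sketch of the circular orbital speed, v = sqrt(G M / r), which depends only on the stellar mass and the distance from the star. The function name and the rounded constants are illustrative choices, not something taken from the project.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
AU = 1.496e11      # astronomical unit, m


def keplerian_speed(stellar_mass_msun, radius_au):
    """Circular orbital speed in m/s, set only by the star's gravity.

    Disk pressure, self-gravity, and planets are ignored, which is
    exactly what "Keplerian" means here.
    """
    return math.sqrt(G * stellar_mass_msun * M_SUN / (radius_au * AU))


# Hypothetical usage: speeds fall off as 1/sqrt(r).
for r in (1, 10, 100):
    print(r, "au:", round(keplerian_speed(1.0, r) / 1e3, 2), "km/s")
```

For a solar-mass star at 1 au this gives roughly 30 km/s. A kink is then a localized region where the observed gas velocity departs from this smooth profile.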

Simulations and observations have shown that planets should leave a characteristic deviation from Keplerian motion. This signature is in the form of a localized “kink” in the line emission. The presence of such a kink, especially one coincident with a gap, is therefore strongly suggestive of a planet.

Top row: Some velocity channels of a disk in Keplerian motion. Bottom row: Disk with planet (circled in white).

Pinte et al. (2018, 2019) were able to detect this signature in two different disks.

Kinematic planet detection in HD 97048 by Pinte et al. (2019) Figure 1

This success has spurred further kinematic analysis and observation of protoplanetary disks. Identifying these kinks can be difficult though. They can be very subtle and easily masked or mistaken for noise. Painstakingly inspecting each channel for a slight signal is slow and tedious. This creates the strong possibility of overlooking planets. New observatories and surveys are going to produce a wealth of data soon, so effectively, quickly, and accurately analyzing these observations is of the utmost importance.

Project

Developing a tool capable of performing that task is the goal of this project. Specifically, the goal is to develop deep learning models that are capable of accurately deciding if a given observed disk hosts at least one planet and, if so, give a rough indication of the planet location(s). Such a tool would allow the analysis of observations to be automated and human effort to be focused on the most promising disks.

Data

There is insufficient observational data to train models. Not only do we not have enough observations, but we also don’t know the truth about most of them. We must therefore rely on simulations to create our data.

The simulations come in two steps: the evolution of the disk and the creation of the synthetic observations. To simulate the disk evolution, we use PHANTOM, a smoothed particle hydrodynamics (SPH) code. We simulate 1,000 disks under a variety of physical and observational conditions (e.g. the mass of the star, the distance to the disk, the viewing angle, etc.). A disk may have anywhere from 0 to 4 planets.

Example disk simulations. The disk on the left has no planets while the disk on the right has three.

Once the disk structure has been simulated, the synthetic observations are made with MCFOST, a radiative transfer code. This creates the velocity channel maps. We then convolve the results and add noise to faithfully replicate current observational capabilities.
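As a rough illustration of the "convolve and add noise" step, the sketch below blurs one velocity channel with a circular Gaussian beam and then adds white Gaussian noise. The post does not give the actual beam shape or noise model, so both are assumptions here, as are all function and parameter names. The code is kept dependency-free for readability; in practice an array library would be the natural choice.

```python
import math
import random


def gaussian_kernel(fwhm_pix):
    """1D Gaussian kernel (sums to 1) for a beam with the given FWHM in pixels."""
    sigma = fwhm_pix / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = max(1, math.ceil(3.0 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel; pixels outside the image count as zero."""
    half = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - half
                if 0 <= xx < width:
                    acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def observe_channel(image, fwhm_pix, noise_rms, rng):
    """Blur one channel with a circular Gaussian beam, then add white noise."""
    kernel = gaussian_kernel(fwhm_pix)
    # A 2D Gaussian is separable: blur along rows, then along columns.
    blurred = transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))
    return [[v + rng.gauss(0.0, noise_rms) for v in row] for row in blurred]


# Hypothetical usage: a single bright pixel smeared by a 4-pixel beam.
channel = [[0.0] * 21 for _ in range(21)]
channel[10][10] = 1.0
observed = observe_channel(channel, fwhm_pix=4.0, noise_rms=0.01, rng=random.Random(0))
```

Real interferometric noise is spatially correlated rather than white, so this should be read only as a cartoon of why a subtle kink can vanish after observational effects are applied.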

The velocity channels before and after convolution and noise addition. The white circle is the beam size of the observation.

80% of the data is used for training, with 20% of that portion held out for validation.
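The split can be sketched as follows. Two things here are assumptions rather than statements from the post: that the split is made per simulated disk, and that the remaining 20% serves as a held-out test set. The function name and the seed are likewise illustrative.

```python
import random


def split_indices(n_items, train_frac=0.8, val_frac_of_train=0.2, seed=0):
    """Shuffle item indices and split them into train / validation / held-out.

    The validation set is carved out of the training portion, so with the
    defaults 1,000 items become 640 train, 160 validation, 200 held-out.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_pool = round(n_items * train_frac)
    n_val = round(n_pool * val_frac_of_train)
    val = indices[:n_val]
    train = indices[n_val:n_pool]
    held_out = indices[n_pool:]  # assumed to act as the test set
    return train, val, held_out


train, val, held_out = split_indices(1000)
print(len(train), len(val), len(held_out))
```

Splitting by disk rather than by individual channel keeps all channels of one simulation on the same side of the split, which avoids leaking information between training and evaluation.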

Models

Given the image-based nature of observations, computer vision models were the obvious choice. There are some crucial differences between using the velocity channels and a normal RGB or grayscale image.

A given observation may have over 50 velocity channels that cover the disk. Analyzing each velocity channel individually is not desirable. The vast majority of channels, even in disks with planets, will not contain a planet’s signature. Furthermore, the local nature of an individual analysis throws away the majority of the physical information from the observation.

It is therefore best to analyze the entire disk at once. Doing so gives a model as much information as possible without any ambiguity as to whether an input has a planet. This somewhat complicates matters, though. Typical computer vision architectures are designed to take in RGB or grayscale images (i.e. 3 and 1 input channels, respectively). We need to input dozens of channels. This does not change the qualitative nature of the algorithms, but it does require tweaking architectures and generalizing the concept of an image to one with an arbitrary number of input channels.

We choose two classes of model to use: EfficientNetV2 and RegNet. Our implementation is based off of PyTorch’s versions. For each class of model, we train a network with 47, 61, and 75 input channels, giving us a total of six models.

We train using an Adam optimizer that has a scheduled learning rate. Binary cross entropy is our loss function. Early stopping is allowed based off of validation loss.
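Early stopping on validation loss can be expressed as a small helper like the one below. The patience and minimum-improvement values are hypothetical, since the post does not state the criteria actually used.

```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical usage with made-up validation losses.
stopper = EarlyStopping(patience=2)
for epoch, loss in enumerate([0.90, 0.70, 0.71, 0.72, 0.50]):
    if stopper.step(loss):
        print("stopping after epoch", epoch)
        break
```

In a real training loop, `step` would be called once per epoch with the validation loss, typically alongside saving the weights whenever a new best is reached.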

In addition to changing the number of input channels, we also do a hyperparameter sweep using Weights & Biases.

W&B hyperparameter sweep for RegNet with 61 input channels

Results

Of the six models trained, five of them resulted in an AUC of >0.95. Even when a confidence of over 95% (as measured by softmax activation) is demanded, four of the models have an accuracy of over 90%.
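One plausible reading of "accuracy when a confidence of over 95% is demanded" is to score only the disks for which the model's predicted probability for its chosen class reaches the threshold. The post does not spell out the exact definition, so the sketch below is an assumption, with hypothetical names throughout.

```python
def accuracy_at_confidence(probs, labels, min_confidence=0.95):
    """Accuracy over predictions whose confidence max(p, 1 - p) meets the threshold.

    probs are predicted probabilities that a disk hosts a planet; labels are 0/1.
    Returns (accuracy, fraction_kept); accuracy is None if nothing passes.
    """
    kept = [(p, y) for p, y in zip(probs, labels) if max(p, 1.0 - p) >= min_confidence]
    if not kept:
        return None, 0.0
    correct = sum(1 for p, y in kept if (p >= 0.5) == bool(y))
    return correct / len(kept), len(kept) / len(probs)


# Hypothetical usage with made-up predictions.
acc, kept = accuracy_at_confidence([0.99, 0.97, 0.60, 0.02, 0.96], [1, 0, 1, 0, 1])
print(acc, kept)  # 0.75 0.8
```

Reporting the fraction of disks kept alongside the accuracy matters, because a model can look very accurate at high confidence simply by being confident about very few inputs.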

Performance metrics for final models
Training metrics for all models. Early stopping was allowed.

Importantly, the results for the five best models were independent of the number of planets within the disk.

The explicit training goal for these models was simply to properly classify the disks. However, some insight into the location of the expected planets is extremely useful. We look to the internal activation structure of the model for this. Strongly activated regions are typically important when the model makes its classification decision, which could suggest that the planet — or at least the planet’s signature — is at that location.

Inspecting the activations shows that this is sometimes the case. There are systems in which the signatures are extremely hard to detect by eye. The model not only confidently identifies them as hosting planets, but it also activates strongly on the regions and/or velocity channels containing the signature.
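A simple way to turn a layer's activations into a rough location hint is to average them over feature channels, subtract the spatial mean, and look for the strongest positive residual. How the mean subtraction is actually done in this project is not stated, so the sketch below is one assumed variant, operating on activations already extracted from the network as nested lists.

```python
def mean_subtracted_map(activations):
    """Average a [channel][row][col] stack over channels, then subtract the spatial mean."""
    n_ch = len(activations)
    rows, cols = len(activations[0]), len(activations[0][0])
    avg = [
        [sum(activations[c][r][k] for c in range(n_ch)) / n_ch for k in range(cols)]
        for r in range(rows)
    ]
    mean = sum(sum(row) for row in avg) / (rows * cols)
    return [[v - mean for v in row] for row in avg]


def peak_location(amap):
    """(row, col) of the strongest positive residual in the map."""
    coords = ((r, c) for r in range(len(amap)) for c in range(len(amap[0])))
    return max(coords, key=lambda rc: amap[rc[0]][rc[1]])


# Hypothetical usage: two feature channels that both light up at (1, 2).
acts = [
    [[0, 0, 0], [0, 0, 4], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 2], [0, 0, 0]],
]
print(peak_location(mean_subtracted_map(acts)))  # (1, 2)
```

Because deeper layers have lower spatial resolution, a peak in such a map points to a region of the disk rather than a precise position.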

Example activation structure for a disk with four planets. (a): Mean-subtracted activations. (b): Disk structure with planets in white. (c): Perfectly resolved velocity channel showing planet’s signature circled in white. (d): Velocity channel after observational effects are added, resulting in an almost completely obscured signature.

Reality is the most important test for these models. We have shown that they work on synthetic observations, but we need to know if they work on actual observations.

We turn to HD 97048 for this test. There is one confirmed planet in this disk that has been observed with kinematic data, so our model must be able to locate it.

We apply all six of our models to the same data used in Pinte et al. (2019). All models predict a planet with a confidence of over 60%, and two models have confidences over 99%. The activation structure indicates that the planet is exactly where Pinte et al. suggested it should be.

Results for HD 97048. (a): HD 97048 observed at 885 microns. (b): The channel with the planet’s signature as given by Pinte et al. (2019). (c): The mean-subtracted activation for an early layer. (d): The mean-subtracted activation for a deeper layer.

Conclusion

We have shown that kinematic data of protoplanetary disks can effectively be used to determine if the disk contains a planet and give some insight into the location of the planet.

Applying our models to actual observations supports previous conclusions.

With further refinements and added capabilities, we anticipate that this work will provide the base for a future pipeline from observations to information about the presence and qualities of any planets.

Use this Jupyter Notebook for further investigation.
