Hunting Exoplanets in Kepler Data — Introducing the lightkurve package

Nuno Ramos Carvalho
Mar 16, 2020

Exoplanets are planets located outside our Solar System. This post briefly introduces how to process Kepler data with the lightkurve Python package in order to study them.

Image credit: Gemini Observatory/NSF/AURA/Artwork by Joy Pollard.

Introduction

The transit method is currently one of the most popular indirect methods used to discover exoplanets. In a nutshell, it consists of measuring the small drop in a star's brightness that occurs when an orbiting planet passes between the star and the observer. The now-retired Kepler telescope was launched by NASA to detect planets orbiting other stars, and it helped detect most of the exoplanets known today.

A planet transit causing the star brightness to dim, image credit: https://imgur.com/43f17Ke.

Lightkurve is a Python package for Kepler & TESS time series analysis, providing a user-friendly way to analyse astronomical flux time series data. The remainder of this post illustrates how to use this package to study the light curve for Kepler-427 b. We start by importing the package:

import lightkurve as lk

Kepler-427 b

Kepler-427 b is a gas giant orbiting a G-type star, with a mass of 0.29 Jupiters. The first step in analyzing this system's Kepler observations is to retrieve the data. We could retrieve the raw pixel files (more on this in a future post), but to keep it simple we get the light curve file already built by the Kepler pipeline. To download it, we call the package's search_lightcurvefile function, passing the target system and the quarter we are interested in (Kepler observations are divided into quarters). Newer lightkurve releases deprecate this function in favour of search_lightcurve.

lcf = lk.search_lightcurvefile('kepler-427', quarter=6).download()

This downloads the required data files. From the lcf object we can retrieve several data products, including two light curves: one for the Simple Aperture Photometry (SAP) flux, and one for the Pre-search Data Conditioning SAP (PDCSAP) flux, which the pipeline has already corrected for instrumental systematics. Using the following code we can retrieve and visualize the PDCSAP curve:

# retrieve the PDCSAP light curve from the file
lc = lcf.PDCSAP_FLUX
# plot the light curve
lc.plot()

The lc object now stores the PDCSAP light curve, and the following figure illustrates the corresponding plot:

Kepler-427 PDCSAP light curve for Quarter 6.

By eyeballing the plot we can identify a repeating dip in the flux, a possible exoplanet transit. Next, we use the to_periodogram function with the Box Least Squares (BLS) method to create a periodogram (a tool for characterizing periodicity in a time series), which will help identify the signal's period:

# create the periodogram from the light curve
pg = lc.to_periodogram(method='bls')
# plot the periodogram
pg.plot()

The following figure illustrates the periodogram plot:

Kepler-427 PDCSAP light curve periodogram plot.

From a visual analysis of the plot, we identify a strong periodic signal with a period of a little over 10 days. We can ask the periodogram object pg for the period at maximum power:

print('Period at max power: {:.3f}'.format(pg.period_at_max_power))
# Period at max power: 10.297 d
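To make the idea behind the BLS search more concrete, here is a minimal pure-Python sketch run on a synthetic light curve. It is not the implementation lightkurve uses: the function name simple_bls, the fixed transit duration, the phase binning, and every number in the synthetic data are assumptions made for illustration only. For each trial period it folds the data, slides a box of the assumed duration across the phase bins, and records how strongly the flux inside the box dips below the mean.

```python
import math
import random


def simple_bls(time, flux, periods, duration, n_bins=200):
    """Return one box-fit power value per trial period (simplified BLS)."""
    n = len(flux)
    mean_flux = sum(flux) / n
    # Work with mean-subtracted flux so a flat light curve scores zero.
    resid = [f - mean_flux for f in flux]
    powers = []
    for period in periods:
        # Fold onto this trial period and accumulate the data in phase bins.
        bin_sum = [0.0] * n_bins
        bin_cnt = [0] * n_bins
        for t, r in zip(time, resid):
            b = int((t % period) / period * n_bins) % n_bins
            bin_sum[b] += r
            bin_cnt[b] += 1
        # Box width in bins for the assumed transit duration.
        width = max(1, round(duration / period * n_bins))
        best = 0.0
        for start in range(n_bins):
            idx = [(start + j) % n_bins for j in range(width)]
            s = sum(bin_sum[i] for i in idx) / n     # weighted in-box flux sum
            frac = sum(bin_cnt[i] for i in idx) / n  # fraction of points in box
            if s < 0 and 0 < frac < 1:  # only dips count as transit candidates
                best = max(best, math.sqrt(s * s / (frac * (1 - frac))))
        powers.append(best)
    return powers


# Synthetic light curve; every number below is illustrative, not Kepler data.
rng = random.Random(42)
true_period, t0, duration, depth = 10.3, 2.0, 0.2, 0.01
time = [0.05 * i for i in range(1800)]  # 90 days at a 0.05-day cadence
flux = []
for t in time:
    phase = ((t - t0 + 0.5 * true_period) % true_period) - 0.5 * true_period
    in_transit = abs(phase) < 0.5 * duration
    flux.append(1.0 - (depth if in_transit else 0.0) + rng.gauss(0.0, 0.001))

periods = [8.0 + 0.01 * i for i in range(501)]  # trial periods, 8 to 13 days
powers = simple_bls(time, flux, periods, duration)
best_period = periods[powers.index(max(powers))]
print("Best trial period: {:.2f} d".format(best_period))
```

The trial period with the highest power should land close to the injected 10.3-day period. A real search would also scan several transit durations and use a finer period grid, which the sketch skips to stay short.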

Now that we have a good estimate of the transit period, we can fold the observations over it, i.e. overlay all the observed transits on top of each other, using the fold function of the light curve object lc:

# fold the curve given the period
fold = lc.fold(period=pg.period_at_max_power)
# plot the fold
fold.plot()

The following image illustrates the folded light curve:

Kepler-427 folded light curve for the inferred period.
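Folding itself is just modular arithmetic on the time stamps. The sketch below shows the idea in plain Python; the helper fold_times and its epoch parameter are hypothetical names for this illustration and are not part of lightkurve, and the epoch value is a made-up mid-transit time.

```python
def fold_times(time, period, epoch=0.0):
    """Return each time stamp as an offset (in days) from the nearest epoch."""
    half = period / 2.0
    # Shift by half a period so phases land in [-period/2, period/2),
    # which places the reference epoch at phase zero.
    return [((t - epoch + half) % period) - half for t in time]


period = 10.297  # days, the value printed by the periodogram above
epoch = 2.0      # hypothetical time of one mid-transit, for illustration only

# Mid-transit times one period apart all fold onto (almost exactly) phase zero.
mid_transits = [epoch + k * period for k in range(5)]
phases = fold_times(mid_transits, period, epoch)
print(phases)
```

Because every transit lands at the same phase, the dips stack on top of each other in the folded plot, while points observed between transits spread across the rest of the phase axis.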

And there we go: we can now proceed to characterise this exoplanet's orbit around its host star, given that we have a good estimate of the period. We can also estimate the planet's diameter from the depth of the dip in the star's flux, or fit a transit model to the observed data.
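As a rough illustration of the diameter estimate, the fractional depth of the dip is approximately the square of the planet-to-star radius ratio, assuming a dark planet crossing a uniformly bright stellar disk (limb darkening and grazing transits are ignored). The sketch below applies that relation; the function name, the 1% depth, and the Sun-sized star are placeholder assumptions, not measured values for Kepler-427.

```python
import math

R_SUN_IN_R_JUP = 9.73  # approximate solar radius expressed in Jupiter radii


def planet_radius_from_depth(depth, star_radius_rsun=1.0):
    """Estimate the planet radius (in Jupiter radii) from a fractional depth."""
    ratio = math.sqrt(depth)  # planet radius divided by star radius
    return ratio * star_radius_rsun * R_SUN_IN_R_JUP


# Placeholder inputs: a 1% dip and a Sun-sized star, not measured values.
print("{:.2f} R_Jup".format(planet_radius_from_depth(0.01)))
```

To use it on real data, the depth would be read from the folded light curve and the star radius taken from a stellar catalogue.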

The lightkurve package provides many useful functions for quickly processing Kepler data; the full documentation is available from its homepage. This post is just a brief introduction, and we'll discuss more complex operations in future posts.

A complete notebook with the code and illustrated results for this post is available here.
