Facebook Prophet

(Almost) everything you should know to use Facebook Prophet like a pro, with many Python code examples👨🏿‍💻 and cheat sheets🧾

Moto DEI
The Startup
7 min readAug 22, 2020

What is Prophet?

“Prophet” is an open-source library, released by Facebook in 2017 and available for R and Python, that helps users analyze and forecast time-series values. Its developers put great effort into making time-series analysis possible without expert knowledge, so it is highly user-friendly for non-experts yet still highly customizable. How lovely!!

Facebook Prophet official logo

In this article, starting from a default model run, I try to summarize the available tuning options, particularly the useful ones, for getting better predictions. It may not be literally everything, because Prophet has so many customizable options! I also give some Python code examples and cheat-sheet-like exhibits.

Photo by Jake Hills on Unsplash

Table of Contents:

- Quick Start Code (in Python) with Default Option Setting

- Prophet Options Cheat Sheets And Use Examples

  • Uncertainty Options / Trend Options / Holiday Options
  • Seasonality Options
  • Adding Regressors / Model Diagnostics

- Background Math of Prophet

- What Prophet Does Not Do

  • Prophet does not allow non-Gaussian noise distribution (at the moment)
  • Prophet does not take autocorrelation on residual into account
  • Prophet does not assume stochastic trend

- End Note

Quick Start Code (in Python) with Default Option Setting

Prophet can handle the following model components:

  • trend with its changepoints,
  • seasonality (yearly, weekly, daily, and other user-defined seasonality),
  • holiday effect, and
  • input regressors.

There are also uncertainty options to control the prediction uncertainty interval.

Here’s what Prophet’s defaults provide for each of the components.

Default option setup of Prophet

Below is quick-start Python code with the default setup.

Everything is prepared to be user-friendly, and no special care is needed for time-series data handling. If you are familiar with basic Python data modeling using sklearn APIs, Prophet code should look similar.

The data used throughout this post is the log-transformed daily page views of the Wikipedia page for Peyton Manning, an American football player, prepared and distributed by the Prophet team.

Simplest quick start code of Prophet
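Since the original code exhibit is an image, here is a minimal sketch of the same default run. The function name `quick_start` and the `csv_path` argument are my own placeholders; the input is assumed to be a CSV with the two columns Prophet expects, `ds` (date) and `y` (value).

```python
def quick_start(csv_path, periods=365):
    """Fit Prophet with default settings and forecast `periods` days ahead."""
    # Heavy imports are kept local to the function.
    import pandas as pd
    from prophet import Prophet  # the package was named `fbprophet` before v1.0

    df = pd.read_csv(csv_path)   # needs two columns: ds (date) and y (value)
    m = Prophet()                # every option left at its default
    m.fit(df)
    future = m.make_future_dataframe(periods=periods)  # history dates + future dates
    forecast = m.predict(future) # yhat, yhat_lower, yhat_upper and the components
    return m, forecast
```

With the returned objects, `m.plot(forecast)` draws the forecast and `m.plot_components(forecast)` draws the component plots.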

What I particularly like here is the ‘make_future_dataframe’ function, because building a dataset for future prediction in time-series analysis is usually an unpleasant task that requires datetime handling. With Prophet, just giving the length of the future period provides the necessary dataframe.

Here’s the set of output plots I got from the code.

Default code output plot
Dataframe ‘forecast’ with many predicted components

Prophet Options Cheat Sheets And Use Examples

Uncertainty Options / Trend Options / Holiday Options

There are options to control uncertainty, trend (growth type, changepoints, and visualization), and holiday effects (by country or user-defined). Here’s a summary:

Uncertainty Options / Trend Options / Holiday Options

Also, here’s a Python code example with the use of some of the options.

Example using trend options and holiday options.
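A sketch of what such a run can look like, under my own assumptions: the function name and the constant `cap` and `floor` arguments are placeholders, and the option values are illustrative rather than tuned.

```python
def fit_logistic_with_holidays(df, cap, floor, periods=365):
    """Logistic trend with cap/floor, US holidays, and changepoints drawn on the plot."""
    from prophet import Prophet
    from prophet.plot import add_changepoints_to_plot

    # Logistic growth needs a 'cap' column ('floor' is optional) in the input data.
    df = df.assign(cap=cap, floor=floor)
    m = Prophet(
        growth="logistic",
        changepoint_prior_scale=0.5,  # illustrative; larger makes the trend more flexible
        changepoint_range=0.9,        # illustrative; share of history where changepoints may fall
        interval_width=0.95,          # illustrative; width of the uncertainty interval
    )
    m.add_country_holidays(country_name="US")  # must be called before fit
    m.fit(df)

    # The future dataframe needs the same cap/floor columns.
    future = m.make_future_dataframe(periods=periods).assign(cap=cap, floor=floor)
    forecast = m.predict(future)
    fig = m.plot(forecast)
    add_changepoints_to_plot(fig.gca(), m, forecast)  # overlay the changepoints
    return m, forecast, fig
```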

You can see that the plot now shows the trend changepoints, and that the trend follows a logistic curve with a floor and a cap, although I don’t think it is reasonable to apply a logistic trend to data after log-transformation. The component plots now also show the holiday effect.

Results of the code

Seasonality Options

Prophet has a lot of options to control seasonality: yearly, weekly, and daily seasonality and their granularity; the mode of seasonality (additive/multiplicative); and user-defined seasonality, including conditional seasonality.

Seasonality Options

Here’s an example using conditional weekly seasonality.
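A minimal sketch of conditional weekly seasonality follows. The season rule (September through January) is borrowed from the example in the Prophet documentation and is an assumption here; the function names and the Fourier order of 3 are also just illustrative.

```python
def is_nfl_season(day):
    """True from September through January (a rough NFL season; an assumption)."""
    return day.month > 8 or day.month < 2

def fit_conditional_weekly(df, periods=365):
    """Separate weekly seasonality for on-season and off-season."""
    import pandas as pd
    from prophet import Prophet

    df = df.copy()
    df["ds"] = pd.to_datetime(df["ds"])
    df["on_season"] = df["ds"].apply(is_nfl_season)
    df["off_season"] = ~df["on_season"]

    m = Prophet(weekly_seasonality=False)  # turn off the built-in weekly term
    m.add_seasonality(name="weekly_on_season", period=7, fourier_order=3,
                      condition_name="on_season")
    m.add_seasonality(name="weekly_off_season", period=7, fourier_order=3,
                      condition_name="off_season")
    m.fit(df)

    # The condition columns must also exist in the future dataframe.
    future = m.make_future_dataframe(periods=periods)
    future["on_season"] = future["ds"].apply(is_nfl_season)
    future["off_season"] = ~future["on_season"]
    return m, m.predict(future)
```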

You can see that the on-season and off-season weekly seasonalities are plotted separately (and look very different, which indicates they are worth splitting).

Result of the code

Adding Regressors / Model Diagnostics

Prophet also allows input regressors (also called explanatory variables or features). Just add the columns to both the input data and the future data, and tell the model about them using ‘add_regressor’.
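As a sketch, the regressor below is a hypothetical ‘nfl_sunday’ flag (1 on Sundays from September through January), modeled on the example in the Prophet documentation; any column would work the same way as long as it exists in both the history and the future dataframe.

```python
def nfl_sunday(day):
    """1 on Sundays during September-January, else 0 (illustrative regressor)."""
    return int(day.weekday() == 6 and (day.month > 8 or day.month < 2))

def fit_with_regressor(df, periods=365):
    """Fit Prophet with one extra regressor column."""
    import pandas as pd
    from prophet import Prophet

    df = df.copy()
    df["ds"] = pd.to_datetime(df["ds"])
    df["nfl_sunday"] = df["ds"].apply(nfl_sunday)  # column in the input data

    m = Prophet()
    m.add_regressor("nfl_sunday")                  # tell the model about it
    m.fit(df)

    future = m.make_future_dataframe(periods=periods)
    future["nfl_sunday"] = future["ds"].apply(nfl_sunday)  # and in the future data
    return m, m.predict(future)
```

Note that the regressor values must be known for the future dates too, which is easy for a calendar flag like this one but not for most real-world features.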

Illustration of how “Rolling Origin” cross validation works (https://www.researchgate.net/figure/Forecast-on-a-rolling-origin-cross-validation_fig1_326835034); blue=training set, orange=validation set

Last but not least, Prophet has many useful functions for model diagnostics: cross-validation in a “rolling origin” fashion (see the illustration above) and output of performance metrics.
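To make the “rolling origin” idea concrete, here is a simplified pure-Python sketch of how the cutoff dates can be generated from an initial training length, a spacing between cutoffs, and a forecast horizon. It is my own illustration of the scheme, not Prophet’s actual implementation, and the dates are made up.

```python
from datetime import date, timedelta

def rolling_origin_cutoffs(first_day, last_day, initial, period, horizon):
    """Cutoff dates: each one trains on data up to the cutoff and
    validates on the `horizon` that follows it."""
    cutoffs = []
    cutoff = last_day - horizon             # the latest cutoff leaving a full horizon
    while cutoff >= first_day + initial:    # keep at least `initial` of training data
        cutoffs.append(cutoff)
        cutoff -= period                    # step the origin back by `period`
    return sorted(cutoffs)

# Made-up example: six years of daily data
cutoffs = rolling_origin_cutoffs(
    first_day=date(2010, 1, 1),
    last_day=date(2015, 12, 31),
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=timedelta(days=365),
)
for c in cutoffs:
    print("train up to", c, "-> validate until", c + timedelta(days=365))
```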

Adding Regressors / Model Diagnostics

Here’s an example using cross-validation option.
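A sketch of the diagnostics calls, assuming `m` is an already fitted Prophet model. The wrapper function and the window lengths are my own illustrative choices.

```python
def cross_validate(m, initial="730 days", period="180 days", horizon="365 days"):
    """Rolling-origin cross-validation, metrics table, and a metric plot."""
    from prophet.diagnostics import cross_validation, performance_metrics
    from prophet.plot import plot_cross_validation_metric

    # One forecast per cutoff date; returns actuals next to predictions.
    df_cv = cross_validation(m, initial=initial, period=period, horizon=horizon)
    # Metrics by horizon, smoothed with a rolling window (0.1 is the default).
    df_p = performance_metrics(df_cv, rolling_window=0.1)
    # Plot one metric against the forecast horizon.
    fig = plot_cross_validation_metric(df_cv, metric="mape")
    return df_cv, df_p, fig
```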

Here is what we get from the code. Six types of metrics are shown for each time horizon, each taken as a moving average, over 37 days in this case (this can be changed with the ‘rolling_window’ option).

The metrics can also be plotted so that you can check visually how things change over the time horizons.

Results of the code

Background Math of Prophet

Math in Prophet is well-discussed in their paper “Forecasting at Scale” or other Medium articles.

Based on “Forecasting at Scale” and their model in the Prophet module, the main formula of the model is described as follows:

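$$y(t) = g(t) + s(t) + h(t) + \epsilon_t$$

where $g(t)$ is the trend, $s(t)$ is the seasonality, $h(t)$ is the holiday effect, and $\epsilon_t$ is the noise term.

Respectively, as given in the paper:

Trend portion (piecewise linear, or logistic with capacity $C(t)$; $a(t)$ indicates which changepoints have passed, $\delta$ holds the rate changes and $\gamma$ the matching offset adjustments):

$$g(t) = \left(k + a(t)^\top \delta\right) t + \left(m + a(t)^\top \gamma\right)$$

$$g(t) = \frac{C(t)}{1 + \exp\left(-\left(k + a(t)^\top \delta\right)\left(t - \left(m + a(t)^\top \gamma\right)\right)\right)}$$

Seasonality portion (Fourier series with period $P$):

$$s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)$$

Holiday effect portion (with $D_i$ the set of dates of holiday $i$):

$$h(t) = Z(t)\,\kappa, \qquad Z(t) = \left[\mathbf{1}(t \in D_1), \dots, \mathbf{1}(t \in D_L)\right]$$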
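As a rough illustration of the additive building blocks (trend + seasonality + holiday effect), here is a toy pure-Python sketch. It is not Prophet’s code: the function names and all parameter values are made up, and only the piecewise linear trend is shown.

```python
import math

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend: base rate k and offset m; the rate shifts by delta after each changepoint."""
    rate, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            rate += delta
            offset -= s * delta  # keeps the trend continuous at the changepoint
    return rate * t + offset

def fourier_seasonality(t, period, coeffs):
    """Seasonality: truncated Fourier series; coeffs is a list of (a_n, b_n) pairs."""
    return sum(
        a * math.cos(2 * math.pi * n * t / period)
        + b * math.sin(2 * math.pi * n * t / period)
        for n, (a, b) in enumerate(coeffs, start=1)
    )

def holiday_effect(t, effects):
    """Holiday: `effects` maps a holiday's day index to its additive effect."""
    return effects.get(t, 0.0)

def model_mean(t):
    """Sum of the three components, with made-up parameter values."""
    g = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[10], deltas=[0.5])
    s = fourier_seasonality(t, period=7, coeffs=[(1.0, 0.0), (2.0, 0.0)])
    h = holiday_effect(t, {25: 3.0})
    return g + s + h
```

Each component can be inspected separately, which is the same visibility the component plots give in Prophet.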

I will not go into the details of the formulas here; I just recommend reading their paper “Forecasting at Scale” for more details.

All parameters are estimated with Stan, either as a MAP estimate (by L-BFGS or the Newton method) or by MCMC sampling, depending on the ‘mcmc_samples’ option (MAP when it is 0, which is the default).

What Prophet Does Not Do

Prophet does not allow non-Gaussian noise distribution (at the moment)

In Prophet, the noise distribution is always Gaussian, and pre-transforming the y values is the only way to handle values that follow a skewed distribution.
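A minimal sketch of such a pre-transformation: fit on the transformed y, then map ‘yhat’, ‘yhat_lower’ and ‘yhat_upper’ back to the original scale. Using `log1p`/`expm1` (which tolerate zeros) is my own choice here; a plain log works the same way for strictly positive values.

```python
import math

def to_log_scale(values):
    """Transform skewed, non-negative values before fitting."""
    return [math.log1p(v) for v in values]

def from_log_scale(values):
    """Map predictions (or interval bounds) back to the original scale."""
    return [math.expm1(v) for v in values]

# A symmetric interval on the log scale becomes asymmetric after back-transforming.
lower, center, upper = from_log_scale([1.0, 2.0, 3.0])
print(center - lower, upper - center)
```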

This topic is actively discussed in one of the issues of the Prophet GitHub repository here, where a possible code customization to allow Poisson and negative binomial distributions for count-data targets was given.

Prophet does not take autocorrelation on residual into account

Since the epsilon noise term in the formula is assumed to be i.i.d. normal, the residuals are assumed to have no autocorrelation, unlike in an ARIMA model.

Actually, when we plot the ACF and PACF of the residuals after fitting the Peyton Manning data, we see a clear AR(1) tendency: an exponentially decaying ACF, a high PACF at lag 1, and a PACF close to zero at lags ≥ 2.
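To show what an “exponentially decaying ACF” looks like, here is a small pure-Python sketch that computes the sample ACF of a simulated AR(1) series. The series is synthetic (not the Peyton Manning residuals), and phi = 0.7 is an arbitrary choice; in practice you would pass in the residuals, i.e. y minus ‘yhat’ over the history.

```python
import random

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

def simulate_ar1(phi, n, seed=0):
    """x_t = phi * x_{t-1} + standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x

rho = acf(simulate_ar1(phi=0.7, n=5000), max_lag=3)
print([round(r, 2) for r in rho])  # roughly 1, phi, phi^2, phi^3
```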

When I created a new data frame with the lagged value and added it as a regressor, just like a manually prepared AR(1) model, the ACF and PACF looked like those of white noise. However, this approach is not implemented in Prophet, so it cannot give future predictions through the regular Prophet functions.

‘y_lag’ represents the y value at the prior time stamp.
Adding y_lag as a regressor appears to give white-noise (WN) residuals.
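A sketch of that experiment, assuming `df` is a pandas DataFrame with ‘ds’ and ‘y’ columns sorted by date. The function name is my own, and the result is in-sample only, for the reason given above.

```python
def fit_with_lagged_y(df):
    """Add the previous y as a regressor and return the in-sample residuals."""
    from prophet import Prophet

    # y_lag is y shifted by one row; the first row has no lag and is dropped,
    # because Prophet does not accept missing values in a regressor.
    lagged = df.assign(y_lag=df["y"].shift(1)).dropna(subset=["y_lag"])

    m = Prophet()
    m.add_regressor("y_lag")
    m.fit(lagged)

    # In-sample fit only: future y_lag values are unknown, so no real forecast.
    fitted = m.predict(lagged.drop(columns="y"))
    residuals = lagged["y"].to_numpy() - fitted["yhat"].to_numpy()
    return m, residuals
```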

This topic is discussed in one of the issues of the Prophet GitHub repository here. An interesting idea from Ben Letham for the MA(1) case was to use the prior time point’s residual as a regressor for the next time point’s value. Since we do not know the true residuals until we fit the true model, the estimation would be iterative, something like boosting. Again, with this approach future predictions cannot be produced by the regular Prophet functions.

Prophet does not assume stochastic trend

Prophet’s trend component is always deterministic (with possible changepoints); unlike ARIMA, it does not assume a stochastic trend. See this web page for a discussion of ‘stochastic trend vs. deterministic trend’.

Usually, we run unit root tests to find out whether the data is stationary or trend stationary. When the tests indicate a unit root, we difference the data until it is stationary, which amounts to assuming a stochastic trend component. Using a deterministic trend (without changepoints) underestimates the uncertainty compared to a stochastic trend, although Prophet appears to use its changepoint components and their future uncertainty to cover for that underestimate.
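The difference in uncertainty can be seen in a small simulation. This is a toy sketch with made-up drift and noise values: with a deterministic trend the noise does not accumulate, while with a stochastic trend (a random walk with drift) every shock stays in the level, so the variance grows with the horizon.

```python
import random
import statistics

def end_point_variances(horizon, n_paths, drift=0.1, sigma=1.0, seed=0):
    """Variance of the value at `horizon` under each kind of trend."""
    rng = random.Random(seed)
    deterministic, stochastic = [], []
    for _ in range(n_paths):
        # Deterministic trend: fixed line plus one independent noise draw.
        deterministic.append(drift * horizon + rng.gauss(0, sigma))
        # Stochastic trend: shocks accumulate step by step.
        level = 0.0
        for _ in range(horizon):
            level += drift + rng.gauss(0, sigma)
        stochastic.append(level)
    return statistics.pvariance(deterministic), statistics.pvariance(stochastic)

det_var, sto_var = end_point_variances(horizon=100, n_paths=2000)
print(det_var, sto_var)  # about sigma^2 versus about horizon * sigma^2
```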

End Note

Prophet is highly usable, with many customizable options that cover most of the extensions needed to model time-series data. It is well modularized as one package, so users can enjoy them without uncomfortable exposure to the math of the model.

The model itself is based on simple building blocks of separate effect components, which are estimated with Stan (MAP or MCMC). This simplicity gives high visibility to each effect and should provide a great basis for discussion between experts and non-experts, although it sacrifices some time-series modeling considerations that are beyond the ‘building block’ approach, such as autocorrelation or stochastic trend.

Principal Engineer/Data Scientist and Actuary with 20 yrs exp in media, marketing, insurance, and healthcare. https://www.linkedin.com/in/moto-dei-358abaa/