Tuning Parameters of Prophet for Forecasting: An Easy Approach in Python

Dr. Sandeep Singh Sandha, PhD
4 min read · Jan 2, 2023


Complete_Github_Notebook_to_Play (the companion notebook for this post). The tuned model gives optimal results on unseen future predictions.

Video: https://www.youtube.com/watch?v=tQ7w34IRNcE

Prophet is regarded as one of the state-of-the-art classical time-series forecasting models. In this post, we will see how to tune Prophet's parameters easily (using Bayesian optimization with Mango) to get an optimal model. Prophet has many hyperparameters, which makes it very hard to find an optimal model without domain expertise; this is exactly the problem Bayesian optimization is designed to solve. The default hyperparameters don't always give the best results, even on very simple datasets.

For example, on the AirPassengers dataset, which records monthly counts of airline passengers, the predictions of the default vs. a tuned Prophet model are shown below.

How to tune the model?

Dataset: AirPassengers (monthly counts of airline passengers). We do a time-based train-test split and hold out the last 40 values for the test set.

This series has a simple upward trend and seasonality.
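To confirm this visually, here is a quick side check (a minimal sketch; it assumes statsmodels is installed, which the rest of this post does not require):

#optional: decompose the series into trend, seasonal, and residual components
#assumes statsmodels is available (pip install statsmodels)
import pandas as pd
from matplotlib import pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

df = pd.read_csv('https://raw.githubusercontent.com/AileenNielsen/TimeSeriesAnalysisWithPython/master/data/AirPassengers.csv')
series = df.set_index(pd.to_datetime(df['Month']))['#Passengers']

#multiplicative model: the seasonal swings grow with the trend
seasonal_decompose(series, model='multiplicative', period=12).plot()
plt.show()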

Default Model Training

Install the necessary packages: pandas, numpy, prophet.

#!pip install pandas numpy prophet

import pandas as pd
import numpy as np

df = pd.read_csv('https://raw.githubusercontent.com/AileenNielsen/TimeSeriesAnalysisWithPython/master/data/AirPassengers.csv')

#transform the dataset into Prophet's expected input format: columns 'ds' (date) and 'y' (value)
df['ds'] = df['Month']
df['y'] = df['#Passengers']

Test_size = 40  #hold out the last 40 values for testing
train_df = df.head(len(df) - Test_size)
test_df = df.tail(Test_size)

from matplotlib import pyplot as plt

f = plt.figure()
f.set_figwidth(15)
f.set_figheight(6)

plt.plot(train_df['ds'], train_df['y'], linewidth = 4, label = "Train Series")
plt.plot(test_df['ds'], test_df['y'], linewidth = 4, label = "Test Series")


plt.legend(fontsize=25)
plt.ylabel('Value', fontsize = 25)
plt.xticks([])
plt.show()

#define the loss function: mean absolute percentage error (MAPE)
def mape(y_true, y_pred):
    return round(np.mean(np.abs((y_true - y_pred) / y_true)) * 100, 2)
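As a quick sanity check of the loss function (an illustrative call, not in the original post): predictions that are off by 10% and 5% should give a MAPE of 7.5.

#sanity check: errors of 10% and 5% average to a MAPE of 7.5
print(mape(np.array([100, 200]), np.array([110, 190])))  #7.5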

#default model training
from prophet import Prophet
model = Prophet()
model.fit(train_df)

#make_future_dataframe extends the training dates by Test_size monthly periods
future = model.make_future_dataframe(periods=Test_size, freq='M')
forecast = model.predict(future)
predictions = forecast.tail(Test_size)  #keep only the forecast horizon

error = mape(test_df['y'], predictions['yhat'])
print('error is:', error)


from matplotlib import pyplot as plt

f = plt.figure()
f.set_figwidth(15)
f.set_figheight(6)

plt.plot(pd.to_datetime(test_df['ds']), test_df['y'], linewidth = 4, label = "Test GroundTruth")
plt.plot(pd.to_datetime(predictions['ds']), predictions['yhat'], linewidth = 4, label = "Predictions Default")

plt.legend(fontsize=25)
plt.ylabel('Value', fontsize = 25)
plt.show()

The default predictions serve as our baseline.

Hyperparameter Search Space

We define a very large hyperparameter search space, as shown below. The definition uses simple, easy-to-understand Python constructs: plain lists for categorical choices, a range for integer values, and scipy distributions for continuous parameters.

from scipy.stats import uniform


param_space = dict(
    growth=['linear', 'logistic', 'flat'],
    n_changepoints=range(0, 55, 5),
    changepoint_range=uniform(0.5, 0.5),
    yearly_seasonality=[True, False],
    weekly_seasonality=[True, False],
    daily_seasonality=[True, False],
    seasonality_mode=['additive', 'multiplicative'],
    seasonality_prior_scale=uniform(5.0, 15.0),
    changepoint_prior_scale=uniform(0.0, 0.1),
    interval_width=uniform(0.2, 0.8),
    uncertainty_samples=[500, 1000, 1500, 2000]
)
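One detail worth noting: scipy's uniform(loc, scale) samples from the interval [loc, loc + scale], not [loc, scale]. So changepoint_range = uniform(0.5, 0.5) covers 0.5 to 1.0, and seasonality_prior_scale = uniform(5.0, 15.0) covers 5.0 to 20.0. A quick way to verify:

#scipy's frozen uniform(loc, scale) draws from [loc, loc + scale]
print(uniform(0.5, 0.5).rvs(5))    #every value lies in [0.5, 1.0]
print(uniform(5.0, 15.0).rvs(5))   #every value lies in [5.0, 20.0]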

Next, we use Mango, a state-of-the-art Bayesian optimization library that can directly search complex spaces for optimal parameters. The search space here is extremely large, so we need an efficient optimizer; grid and random search don't work well in spaces this large. We define a function that trains a Prophet model for a given set of parameters and returns its test error.

from prophet import Prophet
from mango import scheduler, Tuner


def objective_function(args_list):
    global train_df, test_df

    params_evaluated = []
    results = []

    for params in args_list:
        try:
            model = Prophet(**params)
            model.fit(train_df)
            future = model.make_future_dataframe(periods=Test_size, freq='M')
            forecast = model.predict(future)
            predictions_tuned = forecast.tail(Test_size)
            error = mape(test_df['y'], predictions_tuned['yhat'])

            params_evaluated.append(params)
            results.append(error)
        except Exception:
            #assign a high loss to regions of the space that raise exceptions
            #(e.g., growth='logistic' without a 'cap' column in the dataframe)
            params_evaluated.append(params)
            results.append(25.0)

    return params_evaluated, results
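Before launching the full search, it can help to sanity-check the objective function on a single hand-picked configuration (a quick illustrative call; these particular parameter values are arbitrary):

#evaluate one configuration manually; returns ([params], [mape values])
sample = [{'seasonality_mode': 'multiplicative', 'n_changepoints': 25}]
print(objective_function(sample))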

Finally, finding the optimal parameters is very easy. We will use only 10 random initial evaluations followed by 50 Bayesian optimization iterations, i.e., 50 + 10 = 60 model trainings in total.

conf_Dict = dict()
conf_Dict['initial_random'] = 10
conf_Dict['num_iteration'] = 50

tuner = Tuner(param_space, objective_function, conf_Dict)
results = tuner.minimize()
print('best parameters:', results['best_params'])
print('best loss:', results['best_objective'])

best parameters: {'changepoint_prior_scale': 0.03645018575124749, 'changepoint_range': 0.5473968905424325, 'daily_seasonality': True, 'growth': 'linear', 'interval_width': 0.9544262800336518, 'n_changepoints': 45, 'seasonality_mode': 'multiplicative', 'seasonality_prior_scale': 16.83218918593137, 'uncertainty_samples': 1500, 'weekly_seasonality': True, 'yearly_seasonality': True}
best loss: 3.95
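To see the tuned predictions, retrain Prophet with the best parameters and plot them against the ground truth. Here is a minimal sketch of that final step, following the same pattern as the default model above:

#retrain with the tuned hyperparameters and evaluate on the test set
tuned_model = Prophet(**results['best_params'])
tuned_model.fit(train_df)
future = tuned_model.make_future_dataframe(periods=Test_size, freq='M')
forecast = tuned_model.predict(future)
predictions_tuned = forecast.tail(Test_size)

print('tuned error is:', mape(test_df['y'], predictions_tuned['yhat']))

f = plt.figure()
f.set_figwidth(15)
f.set_figheight(6)

plt.plot(pd.to_datetime(test_df['ds']), test_df['y'], linewidth=4, label="Test GroundTruth")
plt.plot(pd.to_datetime(predictions_tuned['ds']), predictions_tuned['yhat'], linewidth=4, label="Predictions Tuned")

plt.legend(fontsize=25)
plt.ylabel('Value', fontsize=25)
plt.show()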

Looking at the tuned model's results (a test MAPE of 3.95), we get the whole picture: automatic tuning over an extremely large search space delivers a clear improvement over the default model.
