Forecasting Made Easy with Facebook Prophet Model

Muhammad Danish Farooq
Published in Virtual Force Inc.
Sep 11, 2022 · 13 min read

Get Hands-on Experience with Facebook Prophet

This blog will not only familiarize you with Facebook Prophet and its use, but also show how to tune multiple time series models with additively combined regressors/variables, and how cross-validation can be used to tune a model and potentially make it perform better.

What is Prophet

Prophet is an open-source library developed by Facebook’s Core Data Science team, with the main purposes of the library being:

  1. Producing high-quality forecasts for time series data while requiring minimal tuning compared to other statistical models like ARIMA, and less understanding of the underlying theory.
  2. Bridging the gap for people who have a decent understanding of the time series domain but less knowledge of the statistical theory required to build forecasting models on time series data.

Prophet is built using Stan, a probabilistic programming language written in C++ and highly optimized for statistical analysis/inference. The foundation of the Prophet model is a Generalized Additive Model, or GAM for short (which uses ‘back-fitting’ to get the best fit on the data), and the Prophet model looks like this:

Figure 1: High-level equation of the Prophet model

Figure 1 gives a high-level overview of the Prophet model: we can see that it takes the sum of three component models plus an error term to model the time series data. This picture was taken from here.
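For reference, the equation in Figure 1 (taken from the paper) is y(t) = g(t) + s(t) + h(t) + ε(t), where g(t) is the trend component, s(t) the seasonal component, h(t) the holiday effects, and ε(t) the error term.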

The full mathematical details can be found in the original paper here.

Figure 2: Problem with automated forecast evaluation

Figure 2 was taken from the original paper, and it explains how automated forecast evaluation can surface problems that the analyst can then visually inspect (the ‘analyst in the loop’ in the graph above) and adjust the model accordingly.

The dataset used in this blog contains weather data from multiple weather/air-quality monitoring stations in Beijing, China (made available by the University of California and available here), and contains different measurements like temperature, SO2 concentration, whether it rained or not, etc.

So let’s get started. The code in this blog is inspired by this amazing repository.

Preprocessing before fitting the model

First of all, import the necessary libraries required to do the magic.

Figure 3: Snippet of libraries to import for the project
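As a rough sketch (the exact imports in Figure 3 may differ slightly), the imports used throughout this blog look something like this:

```python
import itertools
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics
from prophet.plot import plot_cross_validation_metric, add_changepoints_to_plot
```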

If Prophet is not installed, it can simply be installed by running the command pip install prophet. If there is still some issue installing it, it is most likely due to some dependency not being met, and further instructions can be found here.

The original dataset contains 12 CSV files, each containing weather information for a different station/city, as can be observed below:

Figure 4: Layout of the dataset files

These are the files contained in the original zip folder (available on the University of California webpage linked above). We can see that the city/station name appears after ‘PRSA_Data_’, and also that it contains data from 2013/03/01 to 2017/02/28, which might be a bit harder to see at first.

I chose two stations, Wanliu and Tiantan; concatenating their dataframes into one, we get:

Figure 5: Quick sneak peek into the data
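A minimal sketch of this step (the file names are assumed from the naming pattern described above):

```python
# Read the two station files and stack them into one dataframe.
df_tiantan = pd.read_csv('PRSA_Data_Tiantan_20130301-20170228.csv')
df_wanliu = pd.read_csv('PRSA_Data_Wanliu_20130301-20170228.csv')
df_weather = pd.concat([df_tiantan, df_wanliu], ignore_index=True)
df_weather.head()
```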

As we can see, there is a lot of information in this dataframe: the data is given in an hourly format with lots of different measurements, and some NaN/empty values as well.

O3 represents ozone and is our variable of interest; we want to see how it behaves over time. Logically, temperature seems like a good regressor to include as well, due to the extensive use of air conditioners etc. in the summer, which could damage the ozone layer.

To give us less data to work with and make things easier, we convert the data into a daily format by taking the mean of each day’s measurements:

Figure 6: The daily mean of the data was taken to reduce its size

df_weather represents the concatenated dataframe mentioned above, and as we can see, from having ~70,000 data points previously we now have only 2,922.
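A minimal sketch of the daily aggregation, assuming the year/month/day/station column names of the original files:

```python
# Build a proper date column from the year/month/day columns, then take daily means per station.
df_weather['date'] = pd.to_datetime(df_weather[['year', 'month', 'day']])
df_daily = (
    df_weather
    .groupby(['station', 'date'], as_index=False)[['O3', 'TEMP']]
    .mean()
)
```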

Plotting ozone concentration and temperature against time, with the first graph representing Tiantan, we get:

Figure 7: Plot of ozone concentration and temperature over time

Prophet requires the column with the timestamps to be renamed to ‘ds’ and the outcome variable to be renamed to ‘y’. As we can see, ozone and temperature increase and decrease during the same time intervals, so temperature seems like a good regressor variable. Another important thing to note is that the trend of this data does not seem to be growing or decreasing, so when we make the Prophet model we do not need to change the parameter seasonality_mode to multiplicative (the default is additive), and when we add the regressor variable, i.e. temperature, to the model, its mode will also be additive. An example where we would use seasonality_mode = ‘multiplicative’ is given below:

Figure 8: An example where the parameter seasonality_mode is changed to multiplicative (the default is additive)

So if we plot the time series data and get a graph with a clear trend (clearly increasing or decreasing with time) like the one above, then we would use seasonality_mode = ‘multiplicative’ when we instantiate the Prophet model. We could even use growth = ‘logistic’ and set a cap limit instead, if we know that ‘y’, the outcome variable, is limited by a certain threshold, which can itself change over time. For example, if the number of employees of a company kept increasing with the same trend as the graph above, the threshold, as crazy as it sounds, could be the population of the earth; a more realistic threshold could be based on the number of people in a city or a country, revenue, etc.
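For illustration, a hedged sketch of what that logistic/multiplicative variant could look like (df_train is a hypothetical ds/y dataframe and the cap value is purely illustrative):

```python
# Logistic growth with a saturating cap; only useful when y is known to be bounded.
df_train['cap'] = 10000                   # assumed upper bound on y (e.g. population or revenue ceiling)
m = Prophet(growth='logistic', seasonality_mode='multiplicative')
m.fit(df_train)

future = m.make_future_dataframe(periods=365)
future['cap'] = 10000                     # the cap must also be supplied for future dates
forecast = m.predict(future)
```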

To allow cross-validation to work optimally (this comes later) and to get better results, we want to remove all the NaN/empty values in the O3 and TEMP columns, and as we can see, some values were removed:

Figure 9: Removing the NaN/empty values
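A minimal sketch of this cleanup step:

```python
# Drop rows where either the target (O3) or the regressor (TEMP) is missing.
before = len(df_daily)
df_daily = df_daily.dropna(subset=['O3', 'TEMP'])
print(f'Removed {before - len(df_daily)} rows with missing O3/TEMP values')
```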

Around 100 rows were removed, which should allow us to perform cross-validation much more efficiently. If some rows have empty values for either ‘y’ or the regressor variable, i.e. TEMP, then when we try to find the best parameters using cross-validation we will get:

“WARNING:prophet.models:Optimization terminated abnormally. Falling back to Newton.”

and the whole process of finding the best parameter values will take much longer, as the default optimization algorithm, L-BFGS, terminates abnormally and Prophet falls back to Newton.

Now, before fitting the model, we need some elementary data structures to store information for each relevant station: mainly the temperature, as it is the regressor we want to add to the model, and the data frame of the original station itself to help us visualize. We do not need to store the data frame of the original station if we do not want to visualize the results, and we could even use the df_weather data frame, but storing the data frames for the two stations made the code and the visualization easier.

Figure 10: Separate data frames for clean coding and easy visualization

The main purpose of these data structures is to help us plot the results and also to store the temperature values. Since we are adding temperature as a regressor, we need its values for each station, and since we do not have values from 2017-02-28 onwards, we will train the model on 2013-2016 and predict on 2017. If we wanted to predict further into the future, i.e. from 2017-02-28 up to 2018, using temperature, we would need to estimate the temperature values, collect them from an additional source, or simply drop the use of temperature as a regressor.

Now for the last thing before fitting the model: making the training set.

Figure 11: Create for_periods array to get number of days for prediction

The purpose of the for_periods array is to hold the number of days to predict into the future for the Tiantan and Wanliu stations respectively.
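Roughly, the preparation described above could look like this (the variable names are my own; the code in Figures 10–11 may differ):

```python
train_dfs, full_dfs, for_periods = {}, {}, []

for station in ['Tiantan', 'Wanliu']:
    # Rename to the 'ds'/'y' columns Prophet expects, keeping TEMP as the extra regressor.
    df_station = (df_daily[df_daily['station'] == station]
                  .rename(columns={'date': 'ds', 'O3': 'y'})[['ds', 'y', 'TEMP']])
    full_dfs[station] = df_station
    # Train on 2013-2016 and keep early 2017 as the hold-out period.
    train_dfs[station] = df_station[df_station['ds'] < '2017-01-01']
    for_periods.append(int((df_station['ds'] >= '2017-01-01').sum()))  # days to predict per station
```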

Fitting the models

Now that we have the training and temperature/regressor sets for both stations, we are ready to fit the models.

Figure 12: Fitting the Tiantan data into Prophet model for prediction purposes
Figure 13: Fitting the Wanliu data into Prophet model for prediction purposes
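A hedged sketch of how each station’s model is fitted with temperature as an additive regressor (mirroring, not reproducing, the code in Figures 12–13):

```python
m_tiantan = Prophet()                      # seasonality_mode defaults to 'additive'
m_tiantan.add_regressor('TEMP')            # temperature as an additive extra regressor
m_tiantan.fit(train_dfs['Tiantan'])

# Build the future frame and attach the known TEMP values for those dates.
future = m_tiantan.make_future_dataframe(periods=for_periods[0])
future = future.merge(full_dfs['Tiantan'][['ds', 'TEMP']], on='ds', how='left')
future['TEMP'] = future['TEMP'].ffill()    # guard against any dates missing a TEMP value
forecast = m_tiantan.predict(future)
fig = m_tiantan.plot(forecast)
```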

We can see that both models seem to fit the data well, and the prediction on the test set, which is the first graph in both the Tiantan and Wanliu figures, was also pretty good, but we need to evaluate these models rather than just judging by looking at plots.

Observe that if we had 100+ CSV files (i.e. 100+ stations) with weather data for multiple cities, the problem would become much more difficult, and we could not simply hard-code for every station. Suppose we had 100+ CSV files and we just wanted to predict the future (i.e. dates beyond the last date in the files) without any regressor variable: then we would only have to get the relevant y/O3 values for each station for training after grouping by station, and that’s it. If we wanted to add a regressor, we would need the future values of that regressor for the time frame we are predicting; if that is not possible and we still want a regressor, we can only train on the data available to us. And if we want to visualize the performance of each or some of the models on the training and test sets (we would choose the size of the test set), we would need to be able to obtain the train and test data for the relevant station, either by storing them (if space is not an issue) or by constructing the data frame for the relevant station when needed; the code for this would be slightly more complex.

Now onto tuning and testing the models using cross-validation.

Evaluating using Cross-Validation & RMSE for tuning the model

Using the cross_validation function provided by the library, we set the initial training period to two years, after which we make predictions every 30 days over the next 365 days; more detail regarding cross-validation can be found here. This is how cross-validation is done on time series data using Facebook Prophet.

Applying cross-validation using Prophet’s conveniently provided functions and using RMSE, i.e. root mean squared error, as the metric to compare the models, we get the following graphs, with the first representing Tiantan and the second representing the Wanliu station. Note that other metrics like MAPE (mean absolute percentage error), MSE, and MAE are also provided, and any or all of them can be used, but I chose the RMSE value.
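A minimal sketch of this evaluation, reusing the fitted Tiantan model from above:

```python
# 2-year initial window, cutoffs every 30 days, forecasts evaluated over a 365-day horizon.
df_cv = cross_validation(m_tiantan, initial='730 days', period='30 days', horizon='365 days')
df_metrics = performance_metrics(df_cv)        # rmse, mae, mape, coverage, ...
fig = plot_cross_validation_metric(df_cv, metric='rmse')
```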

Figure 14: RMSE value for Tiantan station is 24.92
Figure 15: RMSE value for the Wanliu model is 30.42

So it seems that the Tiantan station model fits the data slightly better as it has a lower RMSE value. There are other metrics that can be used besides RMSE and they can be seen as follows:

Figure 16: Some other validation metrics results

Now we tune the models using the code provided in the Prophet documentation by Facebook, slightly modified to fit this example. Observe that this code performs cross-validation only on specific values of the parameters, including the default values, since testing every single value would take a very long time unless the dataset is very small. The basic idea of the code below is to generate all combinations of the changepoint_prior_scale and seasonality_prior_scale values given in the param_grid dictionary and return the combination with the lowest RMSE; as mentioned previously, RMSE does not necessarily have to be the metric used.

Figure 17: Performing grid search for validation
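For reference, the tuning loop from the Prophet documentation, adapted to this example, looks roughly like this (the exact grid values in Figure 17 may differ):

```python
param_grid = {
    'changepoint_prior_scale': [0.001, 0.01, 0.1, 0.5],
    'seasonality_prior_scale': [0.01, 0.1, 1.0, 10.0],
}

# Generate all combinations of the parameters above.
all_params = [dict(zip(param_grid.keys(), v)) for v in itertools.product(*param_grid.values())]
rmses = []

for params in all_params:
    m = Prophet(**params)
    m.add_regressor('TEMP')
    m.fit(train_dfs['Tiantan'])
    df_cv = cross_validation(m, initial='730 days', period='30 days', horizon='365 days')
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p['rmse'].values[0])

best_params = all_params[np.argmin(rmses)]
print(best_params)
```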

We get two parameters to tune for each model using the code provided by Facebook, changepoint_prior_scale and seasonality_prior_scale, both of which can be set when creating the Prophet model; the search works by giving us the changepoint_prior_scale and seasonality_prior_scale with the lowest RMSE value. Note that holidays_prior_scale can also be added to the code above, and it has the same recommended range as seasonality_prior_scale, i.e. 0.01-10 (explained below), but we did not pass any data frame for the holiday effect, which is why it is not needed here.

Changepoint prior scale refers to how much the trend is allowed to change at ‘trend changepoints’, which are points where there are abrupt changes in the time series data; Prophet detects these changepoints for us by default, with more detail regarding how it does this here. Trend changepoints can also be visualized: for example, below, changepoints are added to the CO2 dataset from the statsmodels library using the conveniently provided add_changepoints_to_plot function in Prophet.

Figure 18: Changepoints added to the CO2 dataset from the statsmodels library
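A hedged sketch of how such a plot can be produced with the statsmodels CO2 data (resampling to monthly means is my own choice, simply to keep the fit fast):

```python
import statsmodels.api as sm

# Load the Mauna Loa CO2 series and put it into Prophet's ds/y format.
co2 = sm.datasets.co2.load_pandas().data['co2'].resample('MS').mean().dropna()
df_co2 = pd.DataFrame({'ds': co2.index, 'y': co2.values})

m = Prophet()
m.fit(df_co2)
forecast = m.predict(df_co2[['ds']])

# Overlay the automatically detected trend changepoints on the forecast plot.
fig = m.plot(forecast)
add_changepoints_to_plot(fig.gca(), m, forecast)
```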

By default, Prophet places potential changepoints only in the first 80% of the data, and this can be changed. We can see that the lines representing the changepoints fall where the graph abruptly goes up or down, i.e. where the trend changes.

The changepoint prior scale is essentially a regularization parameter, and like most regularization parameters, a low value will cause the trend to underfit the data whereas a high value will overfit. Its default value is 0.05, and the range recommended by Facebook for tuning it is 0.001-0.5, found here; this page also contains the ranges for seasonality_prior_scale and holidays_prior_scale.

Seasonality prior scale controls the flexibility of the seasonality. The default value of this is 10 which basically applies no regularization at all, and more details regarding this can be found here.

We get the best parameters for the two models (i.e. the lowest RMSE) with regard to changepoint_prior_scale and seasonality_prior_scale, so let’s make new models with these parameters, keeping everything else constant, to see how they perform compared to the previous models.
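Concretely, the new models are simply re-instantiated with the tuned values, e.g. (a sketch, reusing best_params from the grid search above):

```python
# Refit one station's model with the tuned hyperparameters, everything else unchanged.
m_new = Prophet(changepoint_prior_scale=best_params['changepoint_prior_scale'],
                seasonality_prior_scale=best_params['seasonality_prior_scale'])
m_new.add_regressor('TEMP')
m_new.fit(train_dfs['Tiantan'])
```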

Figure 19: Results of the forecast on the Tiantan station with the old model
Figure 20: Results of the forecast on the Tiantan station with the new model

They look almost identical, but evaluation of the new model is required to see if they really perform the same.

Figure 21: Results of the forecast on the Wanliu station with the old model
Figure 22: Results of the forecast on the Wanliu station with the new model

The new model seems to be overfitting the data, as its changepoint_prior_scale value was 0.5, the highest it could be given the range in the official Facebook documentation and much higher than the default value, allowing much more trend flexibility, even though the actual predictions (dark blue line) of both models look fairly identical. Performing evaluation on the new models and comparing them with the old models, we get:

Figure 23: Results of cross-validation on the Tiantan station with the old model
Figure 24: Results of cross-validation on the Tiantan station with the new model

Observing the results for the Tiantan station, we can see that the RMSE value increased very slightly (almost negligibly) and the standard deviation decreased.

Figure 25: Results of cross-validation on the Wanliu station with the old model
Figure 26: Results of cross-validation on the Wanliu station with the new model

We can see that the RMSE decreased from 30.42 to ~27.5 and the standard deviation decreased from 11.45 to ~6.24, implying that the model with these new parameters performed better.

Please follow the link to my Colab notebook to access the code for this blog.

The new model stayed pretty much the same for the Tiantan station but with a lower standard deviation, so it makes sense to use the new model over the old one for the Tiantan station. Although the RMSE and standard deviation improved for the Wanliu station, the large prediction interval of the new Wanliu model indicates that it is overfitting the data for values roughly after 2016-08 (when the prediction interval gets very large), so which model to use depends on the context. If we are not predicting that far into the future relative to the training set for the Wanliu station, we can use the new model; for predicting further into the future, we can use the old model.

References

  1. Taylor SJ, Letham B. 2017. Forecasting at scale. PeerJ Preprints 5:e3190v2. https://doi.org/10.7287/peerj.preprints.3190v2
  2. Winston A. Robson, “Intro to Facebook Prophet: Walk-thru Example & Repo”, Future Vision, Medium, July 09, 2019.
  3. Srivatsan Srinivasan, “Multiple_Time_Series_using_Prophet.ipynb”, GitHub repository, Aug 22, 2020.
  4. Moto DEI, “Facebook Prophet: (Almost) everything you should know”, The Startup, Medium, Aug 23, 2020.
