Since Sean Taylor and Ben Letham open-sourced Prophet in 2017, it has remained a popular tool for forecasting time series, especially in business and planning contexts where we want to model human activity and consumption (e.g. website traffic, video hours watched). As of January 2023, the Python package has been downloaded over 16 million times via PyPI, and it continues to see 1 million downloads per month. However, long-time users will have noticed that releases have slowed down over the last few years. In this post we’ll walk through our future plans for the package and how we envision it fitting in with the rest of the forecasting ecosystem.
Future work on Prophet will primarily focus on:
- Ease of installation and integration with wrapper packages (e.g. sktime) and analytics services.
- Quality of life changes for applying Prophet in practice, including runtime improvements.
- Clearer documentation on parameter selection, case studies, and performance benchmarks for different types of time series.
We do not plan to make large changes to the underlying model. Prophet’s advantages remain the same — an analyst-in-the-loop forecaster with human-interpretable parameters, easily applied to business analytics use cases — but we encourage anyone looking for cutting edge innovation in forecasting to try packages such as NeuralProphet and Nixtla/statsforecast instead.
Seamless installation and integration
In late 2021, Zillow was under fire from the data science corner of Twitter for their supposed reliance on Prophet in predicting house prices (predictions that informed buy/sell decisions for Zillow Offers, an initiative that lost the company $300m and was shut down). A memorable quote for me (from this thoughtful blog post) was
Prophet is very convenient because it’s hard to beat pip install prophet into from prophet import Prophet into Prophet().fit(df)
— if someone had actually managed to get those commands working on their first try, I would have been impressed!
pip install prophet actually kicked off a build process on the end user’s machine that required PyStan (Stan is the probabilistic programming language used to fit the underlying model) and a compatible C++ toolchain (e.g. Xcode for macOS). This caused a lot of confusion around prerequisites, and the error messages differed from machine to machine, which made them hard to debug. Overall, it was a terrible installation experience for your average data scientist.
Wrapper packages that depended on Prophet also had to pin to older versions of PyStan and Prophet, and it wasn’t straightforward to test whether it was safe to upgrade to newer versions.
With the help of the open source community, we invested in building wheels (binary distributions) for PyPI, across all platforms, so that users could actually just run pip install prophet. There were a few steps to this:
- We swapped our backend from PyStan to CmdStanPy, a lightweight Python interface to the core Stan library (called CmdStan). Once built, Stan executables can be used as-is, so long as there’s a linked library called TBB (Threading Building Blocks). The idea was that if we save these executables, TBB, and Prophet’s compiled Stan code inside the package, an end user could run .fit() without needing to compile any C++ code.
- We extended the build commands from setuptools to enable the above, adding the following steps to the build process: download and install CmdStan, compile the Prophet model, then prune the 1GB CmdStan folder to less than 20MB of necessary executables and the TBB library. It took a few iterations to arrive at a comprehensive (i.e. covering all operating systems) but succinct implementation, and we thank the Stan development team, particularly Brian Ward, for their guidance here.
- Finally, we used cibuildwheel and GitHub Actions to run the build process on CI machines, covering all major operating systems and Python versions.
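The pruning step in the list above can be pictured with a small keep-list filter. This is only a sketch of the general idea, not Prophet's actual build code: the helper name prune_tree, the glob patterns, and the file names in the demo are all made up for illustration.

```python
import fnmatch
import tempfile
from pathlib import Path


def prune_tree(root: Path, keep_patterns: list[str]) -> int:
    """Delete every file under `root` whose relative path matches none of
    `keep_patterns`, then remove directories left empty.
    Returns the number of bytes freed."""
    freed = 0
    # Pass 1: remove files that are not on the keep-list.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            if not any(fnmatch.fnmatch(rel, pat) for pat in keep_patterns):
                freed += path.stat().st_size
                path.unlink()
    # Pass 2: remove empty directories, deepest first.
    for path in sorted(root.rglob("*"), key=lambda p: len(p.parts), reverse=True):
        if path.is_dir() and not any(path.iterdir()):
            path.rmdir()
    return freed


# Demo on a throwaway directory (hypothetical layout, 10 bytes per file).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ["bin/stanc", "src/model.hpp", "lib/tbb/libtbb.so", "docs/guide.txt"]:
        f = root / rel
        f.parent.mkdir(parents=True, exist_ok=True)
        f.write_text("x" * 10)
    freed = prune_tree(root, ["bin/*", "lib/tbb/*"])
    remaining = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )

print(freed, remaining)
```

An explicit keep-list (rather than a delete-list) means anything unexpected in a new CmdStan release is dropped by default, which keeps the wheel small at the cost of having to update the list when a new required file appears.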
Having binary distributions uploaded to PyPI immediately had a positive impact on the installation experience. Below is a chart of Prophet’s GitHub issues over time, categorised by the type of question asked. On average we had seen 8 issues related to Python installation every 2 months, and this halved after the release of Prophet v1.1.
We’ll continue to evolve our build process in line with modern Python packaging standards: more recently, we have followed Bokeh’s
Kats , and analytics services such as Snowflake and Databricks.
Polishing existing applications, rather than introducing new methods
The original Prophet model was designed with a few core components: a piecewise linear trend with changepoints, multiple seasonality, and exogenous regressors, with tuneable parameters (in the form of prior distributions) that allow the end user to tweak the relative strength of each component. This generalised well to most time series generated by human behaviour: shocks and non-stationarity could be captured by the flexible trend component, day-of-week and day-of-year seasonality could be captured by combining multiple Fourier series, and holiday effects could be captured as binary regressors. Since the original release, a couple of major changes helped cover a larger range of time series patterns: multiplicative (instead of additive) seasonality and regressor components, and a “flat” trend that allows seasonality and regressors to dictate the forecast.
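To make these components concrete, here is a minimal pure-Python sketch of how a piecewise linear trend, a Fourier seasonality term, and a binary holiday regressor combine in additive and multiplicative mode. It is an illustration rather than Prophet's implementation: the function names and every coefficient value are assumptions chosen for the example, and in Prophet these quantities are fitted by Stan under the prior distributions mentioned above.

```python
import math


def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend with base growth rate k and offset m. The rate changes by
    deltas[j] at changepoints[j]; the offset is adjusted each time so the
    line stays continuous."""
    rate, offset = k, m
    for s, d in zip(changepoints, deltas):
        if t >= s:
            rate += d
            offset -= s * d  # keeps the two segments joined at t == s
    return rate * t + offset


def fourier_seasonality(t, period, coefs):
    """Sum of cosine/sine pairs; coefs is a list of (a_n, b_n) tuples,
    one per Fourier order n = 1, 2, ..."""
    return sum(
        a * math.cos(2 * math.pi * (n + 1) * t / period)
        + b * math.sin(2 * math.pi * (n + 1) * t / period)
        for n, (a, b) in enumerate(coefs)
    )


def forecast(t, is_holiday, mode="additive"):
    # All numbers below are illustrative, not fitted values.
    trend = piecewise_linear_trend(
        t, k=1.0, m=10.0, changepoints=[50.0], deltas=[-0.5]
    )
    weekly = fourier_seasonality(t, period=7.0, coefs=[(0.0, 0.1)])
    holiday = 0.2 if is_holiday else 0.0  # binary regressor times its coefficient
    if mode == "multiplicative":
        # Seasonality and regressors scale the trend.
        return trend * (1.0 + weekly + holiday)
    # Seasonality and regressors shift the trend.
    return trend + weekly + holiday


for t in (0.0, 50.0, 60.0):
    print(t, forecast(t, is_holiday=False), forecast(t, True, "multiplicative"))
```

In multiplicative mode the same holiday coefficient means "20% above trend" rather than "0.2 units above trend", which is why it tends to suit series whose seasonal swings grow with the level of the series.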
There have been ideas for further enhancements to the model, but at this moment we do not plan to make any more changes to the underlying Stan model. Many of these ideas are likely to be useful (such as allowing different error distributions for y, or different prior distributions for model parameters), but packages like NeuralProphet and Orbit, unlike Prophet, were designed with this type of extensibility in mind. We also foresee innovations in time series forecasting occurring via these new packages rather than Prophet.
Instead, we’ll focus on ironing out the kinks in existing functionality, which should help those currently using Prophet in a real-world setting. Some examples include:
- Keeping holiday data up-to-date by merging our custom data with the holidays package, which will also mean better coverage of countries and subdivisions as newer versions of
- Correctly handling extra regressors during cross-validation and uncertainty estimation. When generating predictions, extra regressors are currently assumed to be known quantities, which is valid for holiday indicators but not quantities that need to be estimated.
- Patching negative predictions for positive-valued time series.
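As an example of the last item, a common workaround today is to floor the forecast columns after prediction. The sketch below assumes forecast rows are plain dictionaries keyed by Prophet's output column names (yhat, yhat_lower, yhat_upper); the helper clip_forecast is hypothetical and is not necessarily how the package itself will address the issue. With the pandas DataFrame returned by .predict(), DataFrame.clip(lower=0) on those columns does the same job.

```python
def clip_forecast(rows, floor=0.0, columns=("yhat", "yhat_lower", "yhat_upper")):
    """Return copies of the forecast rows with the prediction columns
    floored at `floor`; other keys (such as the date) are left untouched."""
    return [
        {
            key: max(value, floor) if key in columns else value
            for key, value in row.items()
        }
        for row in rows
    ]


# Illustrative values only.
rows = [
    {"ds": "2023-01-01", "yhat": -1.5, "yhat_lower": -3.0, "yhat_upper": 0.5},
    {"ds": "2023-01-02", "yhat": 2.0, "yhat_lower": 0.5, "yhat_upper": 3.5},
]
clipped = clip_forecast(rows)
print(clipped)
```

Clipping is a post-hoc patch: it keeps the reported numbers sensible but does not change the fitted model, so the uncertainty interval near zero becomes asymmetric.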
Runtime optimisations can also help greatly in a production pipeline, and it’s common to fit separate models and generate predictions over different cuts of data. A big feature in the v1.1.1 release was vectorizing the simulations used to calculate uncertainty intervals, which sped up .predict() by at least an order of magnitude. This was all thanks to Oren Matar’s initial research. In a future release we’ll introduce NumPyro as another probabilistic programming language backend (thanks to Freddy Boulton who translated the Stan code), which shows considerably faster .fit() speeds for datasets with longer histories, especially when MCMC sampling is required.
Education and documentation
Questions about troubleshooting forecasts, understanding how different parameters influence the model, and adding custom seasonalities / regressors are the most common type of GitHub issue, making up 50% of reported issues in the last six months.
Although we already have foundational documentation, there’s an opportunity to extend it with more case studies (we recently added one for handling systematic changes caused by pandemic lockdowns), and to go into more detail about how to represent prior beliefs about the time series through parameter selections. Prophet is after all meant to be used with an “analyst in the loop”, and one of its advantages over other algorithms is having interpretable, tuneable parameters (see the original paper).
Some benchmarking studies (microprediction, Nixtla/statsforecast) have also shown that Prophet’s default settings lead to poor forecast accuracy for time series with certain characteristics. We really like this research and think it’s important to provide clarity in our own documentation on where Prophet shines (in terms of accuracy) and where other techniques perform better. There are existing efforts to standardise the infrastructure (e.g. sktime) and datasets (e.g. https://forecastingdata.org/) for time series benchmarking, and we’ll contribute to these initiatives with the ultimate aim of guiding end users to the right tool for their use case.
If you’re a regular Prophet user and would like to get involved in any of the above, feel free to dip your toes into our GitHub Issues list. The focus areas discussed above would only require changes to pure Python code or Jupyter notebooks, so they should be accessible to most data scientists / engineers. If you’re an R user, we’d love your help too! We’ve spent most of our time on the Python package due to its popularity, but there’s now a backlog of functionality that can be added to the R package.
— Cuong Duong, Ben Letham, Sean Taylor