DATA STORIES | TIME SERIES ANALYSIS | KNIME ANALYTICS PLATFORM

KNIME & fbProphet: Time Series Forecasting with a few clicks

Build a sophisticated time series forecast with a few clicks using a component in KNIME with Facebook Prophet

Dennis Ganzaroli
Low Code for Data Science

Fig. 1: Time Series Analysis with KNIME & Prophet (image by author).

Time series analysis can be very demanding and sometimes you just want to press a button instead of putting too much time and effort into setting up the analysis.

The Facebook Prophet (fbProphet) library is the solution to our problem and we want to implement it as a component in KNIME so that we only have to adjust a few settings and the whole time series analysis is done automatically.

But why is a time series analysis usually so laborious?

You have probably already heard that many classical time series models require the series to be stationary.
This means that its statistical properties, in particular the mean and the variance, must stay constant over time.
These assumptions are easily violated by the presence of a trend, seasonality, or other time-dependent structures.
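As a rough illustration of this idea, the sketch below compares the mean and variance of the first and second half of a series. The two toy series and the helper name `half_stats` are made up for this example, and a real analysis would rely on a formal statistical test rather than this quick check.

```python
from statistics import mean, pvariance


def half_stats(series):
    """Return (mean, variance) for the first and the second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))


# Toy data (made up for illustration): one flat series, one with a linear trend.
flat = [10, 12, 10, 12, 10, 12, 10, 12]
trending = [10 + 2 * t for t in range(8)]

flat_first, flat_second = half_stats(flat)
trend_first, trend_second = half_stats(trending)

print("flat:    ", flat_first, flat_second)
print("trending:", trend_first, trend_second)
```

For the flat series both halves share the same mean and variance, while the mean of the trending series shifts between the halves, which is exactly what a trend does to stationarity.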

Fig 2: stationary vs. non-stationary time series (image by author).

For example: The plot of the air passenger data below shows that the time series is not stationary. There is an obvious trend in the data and a repeating seasonal effect.

Fig 3: Number of monthly airline passengers in thousands from 1949 to 1960 (image from author).

In order to remove the deterministic components, we have to decompose our time series into separate stationary and deterministic components.
The general mathematical representation of the decomposition approach is:

Y(t) = T(t) + S(t) + e(t)

where…

  • Y(t): is the time series value (actual data) at period t
  • T(t): is a deterministic trend-cycle or general movement component
  • S(t): is a deterministic seasonal component
  • e(t): is the irregular, stationary component (the remainder or residual).

Fig 4: Time series with trend and seasonality component (image by author).

By subtracting the trend and the seasonality from the original signal, we are left with the residual. This residual series is now stationary.
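The subtraction described above can be sketched in a few lines of plain Python. This is a minimal version of the classical additive decomposition (centered moving average for the trend, per-position averages for the seasonality), not the procedure used inside the KNIME component. The toy series and the function names are made up for illustration.

```python
def centered_trend(series, period):
    """Centered moving average; positions without a full window are None."""
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight (2 x period MA).
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[i] = total / period
    return trend


def decompose(series, period):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    trend = centered_trend(series, period)
    detrended = [y - t if t is not None else None for y, t in zip(series, trend)]

    # Average the detrended values for each position within the period.
    means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % period == k]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / period  # center the seasonal pattern around zero
    seasonal = [means[i % period] - offset for i in range(len(series))]

    residual = [
        y - t - s if t is not None else None
        for y, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual


# Toy series: linear trend plus a repeating pattern of length 4, no noise.
pattern = [5, -5, 0, 0]
series = [100 + 2 * t + pattern[t % 4] for t in range(16)]

trend, seasonal, residual = decompose(series, period=4)
print(seasonal[:4])
print([r for r in residual if r is not None])
```

Because the toy series contains no noise, the residual is zero wherever the trend is defined. With real data such as the air passenger series, the residual would carry the irregular component e(t).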

Fig 5: Original time series and residual after decomposing (image by author).

Now we could go to the next step and try to model the residuals and then create a predictive model.

We have already covered this example of classical time series analysis in a previous article.

But wouldn’t it be awesome to have everything created by an automated solution?

The Facebook Prophet algorithm

fbProphet, also simply called Prophet, is a forecasting library that Facebook’s data science team released as open source in 2017 (since version 1.0 the Python package itself is named prophet). The algorithm is designed to be scalable, fast, and accurate, making it suitable for a wide range of applications, from predicting sales in e-commerce to forecasting weather patterns.

The core idea behind Prophet is to model time series data as a combination of trend, seasonality, regressors, and noise components.
By decomposing the data into these components, the algorithm can generate accurate forecasts that capture the underlying patterns in the data.

Y(t) = T(t) + S(t) + h(t) + e(t)

where…

  • Y(t): is the time series value (actual data) at period t
  • T(t): is a trend component, modelled by default as a piecewise linear function with changepoints (a saturating logistic growth curve is available as an alternative)
  • S(t): is a seasonal component, such as a weekly or yearly pattern, modelled using a Fourier series, which allows for flexible modelling of different seasonal shapes
  • h(t): covers holiday effects and additional regressors that may influence the time series, such as weather patterns or marketing campaigns
  • e(t): is the irregular (remainder or residual) component. This noise term captures the random fluctuations in the data that cannot be explained by the other components.
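To make the trend and seasonality terms more concrete, here is a minimal sketch of a continuous piecewise linear trend and a Fourier-series seasonality in plain Python. The function names and all parameter values are made up for illustration. Prophet estimates the corresponding slopes, changepoint adjustments, and Fourier coefficients from the data.

```python
import math


def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend with base slope k and offset m; the slope changes by delta at each changepoint."""
    slope, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta
            offset -= s * delta  # adjust the offset so the line stays continuous at s
    return slope * t + offset


def fourier_seasonality(t, period, coeffs):
    """Sum of cosine/sine pairs; coeffs is a list of (a_n, b_n) for n = 1, 2, ..."""
    return sum(
        a * math.cos(2 * math.pi * n * t / period)
        + b * math.sin(2 * math.pi * n * t / period)
        for n, (a, b) in enumerate(coeffs, start=1)
    )


def model(t):
    """Y(t) without holidays and noise: T(t) + S(t), using made-up parameters."""
    trend = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[5], deltas=[1.0])
    season = fourier_seasonality(t, period=12, coeffs=[(0.5, 2.0), (0.0, 0.3)])
    return trend + season


for t in (0, 4, 5, 6, 12):
    print(t, round(model(t), 3))
```

More Fourier pairs let the seasonal curve follow sharper patterns, at the risk of overfitting. More changepoints do the same for the trend.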

Prophet fits this model in a Bayesian framework implemented in Stan. By default it computes a maximum a posteriori (MAP) estimate of the model parameters and derives the uncertainty intervals by simulating future trend changes and observation noise. Full posterior sampling of the parameters via MCMC is available as an option. Either way, the algorithm generates probabilistic forecasts that provide a measure of uncertainty around the point forecast.
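The idea of an uncertainty interval can be illustrated without Prophet itself: given many simulated outcomes for one future time step, the interval bounds are simply quantiles of those draws. The sketch below uses made-up normally distributed draws, not real Prophet output. In Prophet, the bounds appear as the yhat_lower and yhat_upper columns of the forecast, and their coverage is set by the interval_width parameter (0.8 by default).

```python
import random
from statistics import median, quantiles

random.seed(42)  # fixed seed so the toy example is reproducible

# Made-up simulated outcomes for a single future time step.
draws = [random.gauss(100, 5) for _ in range(1000)]

deciles = quantiles(draws, n=10)        # 9 cut points: 10%, 20%, ..., 90%
lower, upper = deciles[0], deciles[-1]  # bounds of an 80% interval
point = median(draws)                   # central estimate

print(f"forecast {point:.1f}, 80% interval [{lower:.1f}, {upper:.1f}]")
```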

Installing the Prophet library in KNIME

The Python integration in KNIME makes it possible to run Python scripts seamlessly in KNIME workflows and thus combine the best of visual programming and scripting in one tool.

If you don’t know KNIME yet, the “Getting Started Guide” is a good place to begin, and the open-source software can be downloaded for free.

The visual programming language of KNIME is self-explanatory and therefore easy to learn.

Fig. 6: Getting Set Up with KNIME Analytics Platform (image from KNIME).

To use the Prophet library in KNIME, you just have to open the workflow in my KNIME Community Hub space or drag and drop the component into your KNIME workflow editor.

Fig. 9: KNIME Workflow with fbProphet example in my KNIME Hub (image from KNIME).

We will not explain in detail how the component was built and how the Python script was written within the component. That would go beyond the scope of this article.

This time, we will content ourselves with just using the component so that we can quickly perform our time series analysis. But just out of curiosity, let’s take a look inside the component:

The component contains various configuration nodes on the left-hand side, which are used for the input screen. There is also a Python script node in which the time series predictions are created using the Prophet library.

Fig. 10: Interior view of the fbprophet component (image from KNIME).

Time Series Analysis in KNIME with Prophet

Now let’s repeat the example with the time series of passengers from above, but this time with the new fbprophet component node in which I have included the Prophet Python library.

The KNIME workflows with all the examples can be found on my KNIME Community Hub space.

Double-clicking the fbprophet component node opens an input mask.
For the “Date Column” we choose “Month” and for the “Data Column” we choose “#Passengers”.

We would also like to predict the next 12 months. Therefore we choose a “Forecast Period” of 12 and set the “Frequency” to “M” (monthly).

Important: The date column must be in string format!
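The Python script inside the component is not shown in this article, but the settings above map naturally onto Prophet's Python API. The following is a minimal sketch of that mapping, assuming the `prophet` and `pandas` packages are installed. The function names, the default arguments, and the sample values are illustrative assumptions and not the component's actual code.

```python
def to_prophet_rows(dates, values):
    """Pair date strings with values using the column names Prophet expects (ds, y)."""
    return [{"ds": d, "y": float(v)} for d, v in zip(dates, values)]


def forecast(rows, periods=12, freq="MS"):
    """Fit Prophet on the rows and predict `periods` further steps.

    periods ~ the component's "Forecast Period", freq ~ its "Frequency".
    In pandas, "MS" means month start and "M" means month end.
    """
    # Imported here so that to_prophet_rows works without these packages.
    import pandas as pd
    from prophet import Prophet

    df = pd.DataFrame(rows)
    df["ds"] = pd.to_datetime(df["ds"])  # string dates -> timestamps

    model = Prophet()
    model.fit(df)

    future = model.make_future_dataframe(periods=periods, freq=freq)
    result = model.predict(future)
    # yhat is the point forecast, yhat_lower/yhat_upper bound the uncertainty interval.
    return result[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Illustrative values only; a real run needs the full series.
rows = to_prophet_rows(["1949-01", "1949-02", "1949-03"], [112, 118, 132])
print(rows[0])
```

Calling `forecast(rows)` on the complete monthly series would return the fitted values followed by 12 forecasted months, which corresponds to what the component plots.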

Fig 11: Input mask for fbprophet Component (image by author).

After executing the KNIME workflow we get quite a good fit of the time series and also a forecast for the next 12 months.

Fig 12: Forecasting air passengers with fbprophet (image from author).

In addition, the lower part of the chart shows the time series decomposed into its trend and seasonality components.

Fig 13: Components and Goodness of fit of time series (image from author).

We see that the number of passengers increases every year and that there is a peak during the summer vacation months.

The goodness of fit of the model is also shown in the table at the bottom right.

Let’s try another example:

On Kaggle you can download a daily climate time series for the city of Delhi. After importing it into KNIME, we execute our fbprophet node again and choose “meantemp” as the data column.

The result appears almost immediately and shows how well our implementation is working.

Fig 14: Daily climate data in the city of Delhi from 2013 to 2017 (image from author).

Let’s now explore how to use Prophet with additional regressors. We will use the same temperature data as before but this time include the effect of humidity, wind speed, and mean pressure in the model.

Simply click on the “Use regressor” button and move the desired columns to the right-hand panel.

Fig 15: Enabling the regressor columns in Prophet (image from author).

The new model now has a better goodness of fit: the MAE (mean absolute error) drops from 1.511 to 1.201.
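The MAE is the average absolute difference between the actual and the fitted values. How the component computes it internally is not shown here, so the following is just a minimal sketch with made-up numbers.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values for illustration only.
actual = [1, 2, 3]
predicted = [2, 2, 5]
print(mae(actual, predicted))
```

Because the MAE is expressed in the unit of the data, a drop from 1.511 to 1.201 here means the fitted temperature is on average about 0.3 degrees closer to the measured one.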

The extra regressors are included in the linear component of the model and displayed in the components diagram.
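In Prophet's Python API, extra regressors are registered with `add_regressor` before fitting. One point worth knowing is that the regressor values must also be available for the dates being predicted. The sketch below assumes the `prophet` package is installed. The helper names and the column names (such as "humidity") are assumptions for illustration and not the component's actual code.

```python
def require_columns(columns, regressors):
    """Raise a ValueError if any regressor is missing from the given column names."""
    missing = [name for name in regressors if name not in columns]
    if missing:
        raise ValueError(f"missing regressor columns: {missing}")


def forecast_with_regressors(history, future, regressors):
    """history: DataFrame with ds, y and the regressor columns.
    future: DataFrame with ds and the same regressor columns.
    """
    # Imported here so that require_columns works without prophet installed.
    from prophet import Prophet

    require_columns(history.columns, regressors)
    require_columns(future.columns, regressors)

    model = Prophet()
    for name in regressors:
        model.add_regressor(name)  # has to happen before fit()
    model.fit(history)
    return model.predict(future)


# The check itself works on plain lists of column names.
require_columns(["ds", "y", "humidity"], ["humidity"])
```

For a true forecast into the future, this means the regressors themselves have to be known or forecast in advance, for example from a weather forecast.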

Fig 16: Model Daily climate data in the city of Delhi with extra regressors (image from author).

Conclusion

We have seen how easy it is to create sophisticated time series analyses more or less automatically by breaking them down into a seasonality and trend component and even adding external regressors.

On the one hand, the data scientist can easily build time series analyses; on the other hand, the developer can add further features to the component.

That is the power of KNIME. Getting them all on the same working platform!

Thanks for reading and may the Data Force be with you!
Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn or Twitter and follow my Facebook Group “Data Science with Yodime”.

Material for this project:

References:
- Prophet: Forecasting at Scale

Dennis Ganzaroli
Low Code for Data Science

Data Scientist with over 20 years of experience. Degree in Psychology and Computer Science. KNIME COTM 2021 and Winner of KNIME Best blog post 2020.