Forecasting an Intermittent Time Series

Use of tsintermittent in R

Michael Grogan
The Startup


One challenge that data scientists come across is forecasting an intermittent time series, that is, a time series in which many of the observations are 0.

An example of this is daily rainfall: on days with no rainfall, a value of 0 is recorded. The result is a volatile time series with no clearly defined trend, which is much more difficult for a conventional time series model such as ARIMA to forecast.

The data below is sourced from Met Éireann, the Irish meteorological service:

[Figure: daily rainfall in mm. Source: Met Éireann]

As we can see, forecasting trend and seasonal patterns would prove quite tricky, given that the many 0 values in the data occur at irregular intervals.

The conventional solution in this case might be to aggregate the time series, e.g. sum the rainfall in mm over every 30 days in order to forecast a monthly time series. However, this would result in significant data loss, and with less than three years of data, any forecast may well prove to be quite superficial.
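To make the data loss concrete, here is a minimal sketch in plain Python of that kind of 30-day aggregation. The `aggregate` helper, the 1,000-day length and the rainfall values are all hypothetical, chosen only for illustration; they are not the Met Éireann data.

```python
def aggregate(series, window=30):
    """Sum consecutive, non-overlapping windows; an incomplete tail is dropped."""
    n_full = len(series) // window
    return [sum(series[i * window:(i + 1) * window]) for i in range(n_full)]


# Hypothetical daily rainfall in mm: rain on every fourth day only (toy data).
daily = [5.0 if day % 4 == 0 else 0.0 for day in range(1000)]
monthly = aggregate(daily, window=30)

print(len(daily), "daily observations ->", len(monthly), "aggregated observations")
```

With 1,000 daily values and a 30-day window, only 33 aggregated observations remain, which is very little for a model to learn trend or seasonality from.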

When working with a time series such as this, the tsintermittent package in R can come in quite handy.

In particular, Croston’s method is used on a training set for the 1,000 days of data…
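For readers unfamiliar with Croston's method, the idea is to smooth the non-zero values and the gaps between them separately, then forecast their ratio. The sketch below is a from-scratch illustration in plain Python, not the tsintermittent implementation. The `croston` function name, the fixed smoothing parameter `alpha=0.1`, the choice of initial values and the toy series are all assumptions made for this example.

```python
def croston(series, alpha=0.1):
    """Flat Croston forecast: smoothed non-zero size / smoothed interval.

    Assumptions of this sketch: one smoothing parameter for both components,
    initialised from the first non-zero observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    nonzero_positions = [i for i, y in enumerate(series) if y > 0]
    if not nonzero_positions:
        return 0.0  # nothing but zeros: forecast zero

    first = nonzero_positions[0]
    size = float(series[first])   # smoothed non-zero value
    interval = float(first + 1)   # smoothed number of periods between non-zeros
    periods_since = 1             # periods elapsed since the last non-zero value

    for y in series[first + 1:]:
        if y > 0:
            # Both estimates are updated only when a non-zero value occurs.
            size = alpha * y + (1 - alpha) * size
            interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1

    return size / interval


# Toy series (hypothetical): 4 mm of rain on every third day.
toy = [0, 0, 4, 0, 0, 4, 0, 0, 4]
print(round(croston(toy), 3))
```

For this toy series the smoothed size stays at 4 and the smoothed interval at 3, so the forecast is about 1.333 mm per day. Library implementations typically go further, for example by optimising the smoothing parameters or applying bias corrections.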

Michael Grogan

Statistical Data Scientist | Python and R trainer | Financial Writer | michael-grogan.com