Post-Pandemic Flu Forecasting
By Guzal Bulatova
Predicting flu patterns is essential for business planning at consumer healthcare companies like Haleon. By accurately forecasting upcoming flu seasons, Haleon can ensure that we have enough over-the-counter flu medications, pain relievers and cough medications in stock to meet demand and better serve our customers.
The grey swan that was the COVID-19 pandemic affected the whole planet: the way we meet, work and shop. This has had an impact on many different time series, including production, transportation and sales, to name a few.
Flu incidence patterns have been affected as well, showing significant deviations in trend, cycle and, most prominently, seasonal patterns during 2020–2022. Even though the patterns have now returned to their pre-pandemic rhythm, this highly irregular historical data must be accounted for when forecasts are created. Dealing with sporadic outliers can be troublesome enough, and here we have an anomaly period that lasted approximately three years.
In this article we take a closer look at an affected time series: FluID’s Influenza-Like Illness (ILI) incidence in the US [1]. We assess different approaches for dealing with the COVID-19 period and compare them with a benchmark approach, which is to leave the pandemic-affected period as is. We’ll study their effects on forecast accuracy and summarise our observations.
Flu time series decomposition
The dataset in focus contains reported ILI cases in the US (country-wide), weekly, from January 2012 to September 2023, population-adjusted (per 100,000 people). Even without decomposition analysis, the seasonal patterns are prominent, with clear peaks in the first two months of every year during the pre-pandemic period. The same peaks do not occur at the start of 2021; instead, they appear at the end of 2021, following lockdown and travel policy updates.
Below is a seasonal plot of the same time series, where each year shares the same x-axis representing the week number within the respective year. The pre-pandemic years 2017–2019 (orange) follow a similar pattern, with peaks at the beginning and end of the year. The pandemic years (purple, blue and light blue) each follow a unique pattern unlike any other year. 2023 (red) also follows its own pattern, though one somewhat closer to the pre-pandemic years:
Let’s check the randomness in our dataset. For that we’ll use autocorrelation function (ACF) graphs, which depict correlations at varying time lags. The correlation values range between -1 and 1, where -1 is a strong negative correlation, 1 is a strong positive correlation and 0 is no correlation.
Below we have ACF plots where autocorrelations are computed for the number of ILI cases at lags 0 to 52. The ACF for the whole time period looks quite promising: the correlation values are significantly non-zero at many lags, signifying that the data is non-random. Lags 1–8 are above 0.5:
However, if we look at the periods separately, the ACF for the pre-pandemic period shows a more regular, sinusoidal pattern, with not only positive but also negative peaks repeating within a yearly period:
Adding 2023 to the pre-pandemic period flattens the negative peak, but it is still present and the positive peaks stay largely unaffected:
The separate 2020–2022 period shows a very different picture, with the majority of values near zero (33 out of 52 lags), indicating significantly higher randomness than in the periods before and after:
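For reference, sample autocorrelations like those plotted above can be computed in a few lines of NumPy; this is a minimal sketch, where the sine series is a synthetic stand-in rather than the ILI data:

```python
import numpy as np

def acf(series, nlags=52):
    """Sample autocorrelation at lags 0..nlags; values fall in [-1, 1]."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)  # lag-0 reference (n times the variance)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom
                             for k in range(1, nlags + 1)])

# Sanity check on a noiseless yearly cycle (weekly data, period 52):
weeks = np.arange(520)
r = acf(np.sin(2 * np.pi * weeks / 52))
# r[52] is strongly positive; r[26] (half a year apart) is strongly negative
```

A strongly seasonal series shows exactly the sinusoidal ACF shape described above, while a random series keeps most lags near zero.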
Decomposing the 2020–2022 period with Seasonal and Trend decomposition using Loess (STL), we take a closer look at the key time series components, each representing an underlying pattern category:
- Seasonal — representing effects of seasonal factors such as the time of the year,
- Trend — long-term increase or decrease in the data and
- Residuals — containing anything else in the time series.
In the decomposed pandemic period (right graph) the seasonal component is absent and the residuals fall within the range [-20, 8]; before 2020 (left graph) the seasonal component was quite strong and the residuals fluctuated within the range [-10, 10]:
We are going to assess the forecasts for 2023 in three scenarios:
- using the data “as is” with basic normalisation only,
- excluding the 2020–2022 period,
- smoothing the pandemic period.
For our purpose of assessing general effects on forecast quality, and for simplicity, we make point forecasts using “statistical” models: seasonal Naïve, Holt-Winters Exponential Smoothing and Prophet.
Approach 1: Include the pandemic period
Simply replacing outliers without thinking about why they occurred is a dangerous practice. They may provide useful information about the process that produced the data, which should be taken into account when forecasting [2].
Another reason for including the pandemic period, besides using it as a benchmark, is that it is straightforward: the less we change the data, the truer a reflection of the real-world phenomenon it is.
All forecasts are for 39 steps (weeks) ahead, out-of-sample.
We split the data into training and testing sets using a sliding window approach. Say we have a set of 100 observations. An example of a sliding window with training size 10, testing size 5 and step = 1:
- first split: 10 points [1, 10] as training set and 5 points [11, 15] as testing set
- second split: we move one step forward, and use [2, 11] to train and [12, 16] to test,
- and so on, until the last split, where [86, 95] is our training and [96, 100] is our test.
Here’s a schematic representation:
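The same splitting scheme can be sketched in a few lines of Python (the function name is illustrative):

```python
def sliding_window_splits(n_obs, train_size, test_size, step=1):
    """Yield (train_indices, test_indices) pairs over a series of n_obs points."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += step

splits = list(sliding_window_splits(100, train_size=10, test_size=5))
# 86 splits: the first trains on points 1-10, the last on points 86-95 (1-indexed)
```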
To establish a baseline for the three models we did backtesting: checking how the models predict the period before 2020. We generated 75 sliding window splits with step=1 to assess the models’ general performance on the non-random, highly seasonal period before the pandemic started:
We calculate the Mean Absolute Error (MAE) and Root-Mean-Square Error (RMSE) for each forecast and then average over the 75 splits. Holt-Winters Exponential Smoothing (Holt-Winters ETS) performs best with an MAE of 3.7, followed by Seasonal Naïve (sNaïve) and Prophet. Given that the values of the forecasted variable are within [0, 112], absolute errors of 3.7–6.1 are quite low:
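For reference, the two error metrics are straightforward to compute; a minimal NumPy sketch:

```python
import numpy as np

def mae(actual, forecast):
    """Mean Absolute Error: average magnitude of the errors."""
    return np.mean(np.abs(np.asarray(actual) - np.asarray(forecast)))

def rmse(actual, forecast):
    """Root-Mean-Square Error: penalises large errors more heavily."""
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2))

# The reported figures are these per-split errors averaged over all 75 splits,
# e.g. np.mean([mae(test_values, model_forecast) for each split])
err_mae = mae([1, 2, 3], [2, 2, 2])
err_rmse = rmse([1, 2, 3], [2, 2, 2])
```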
For testing our three approaches our focus is on the last 39 weeks of the whole set, i.e., the weeks of 2023.
Now we use the whole dataset. To test the first approach, leaving the pandemic period in, we again generated 75 splits with step=1 and a forecasting horizon of 39.
Considering the observed values of ILI cases during the preceding period, we are able to generate rather accurate forecasts for 2023 without any additional adjustments:
We observe a slight increase in errors compared to the backtesting period. sNaïve performs best at 6.8 MAE / 9.25 RMSE on the 2023 period, followed by Holt-Winters ETS and Prophet.
Approach 2: Exclude the pandemic period
On the other hand, if we assume that the outliers will not occur in the future and genuinely are errors, we can modify the data to account for them; the most radical approach is excluding the 2020–2022 period altogether. This assumes a “the world is back to pre-pandemic normal” scenario.
Excluding the pandemic, however, doesn’t improve the accuracy of the predictions. Comparing the forecast plots, we can observe a very prominent peak in the first months of the year that the models learned from the pre-pandemic training set. See February 2023 on the plot above: all three models predict a peak, whereas the actual values (purple) show only a small peak there and generally follow a declining trend.
The best model here is again seasonal Naïve, with an MAE of 11.7, nearly twice the 6.8 observed when the pandemic period was included in the training set. It doesn’t seem like a good idea to simply ignore the COVID-19 period.
Approach 3: Adjust the data
Given the extent of the period, we apply variance reduction and MinMax normalisation to the values. The variance is reduced by splitting the set into yearly subsets and multiplying each element in a subset by 100/mean(subset), i.e. x * 100 / mean(subset) for each x in the subset.
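A sketch of the yearly variance reduction step in pandas (the function name and toy series are illustrative; MinMax normalisation would be applied afterwards):

```python
import numpy as np
import pandas as pd

def reduce_yearly_variance(series):
    """Rescale each calendar year so its mean becomes 100 (x * 100 / yearly mean)."""
    return series.groupby(series.index.year).transform(lambda x: x * 100 / x.mean())

# Toy weekly series where the second year has far larger values than the first
idx = pd.date_range("2020-01-05", periods=104, freq="W")
raw = pd.Series(np.r_[np.full(52, 5.0), np.full(52, 50.0)], index=idx)
scaled = reduce_yearly_variance(raw)
# After scaling, each calendar year's mean is exactly 100
```

This brings the pandemic years onto the same scale as the rest of the series without removing them.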
Although the outlier years are still noticeable, their extent has been significantly reduced.
This approach proves better than the complete exclusion of the 2020–2022 period, and for Holt-Winters ETS and Prophet it is their peak performance within our scope. However, these forecasts don’t beat the MAE of 6.8 set by sNaïve on the simple population-adjusted data:
Conclusion
In this exercise we looked into the influence of including, excluding and adjusting the COVID-19-affected period when forecasting Influenza-Like Illness incidence post-pandemic. The hypothesis was that addressing the period would improve forecasting accuracy even with standard time series models like ETS, but we were proved wrong.
While the difference was not tremendous, the best forecasts we achieved were with the simplest approach and the simplest method.
Philosophically, when dealing with outliers we first need to check our base assumption: whether the process(es) causing these deviations are going to recur and/or are affecting the future. We tested the assumption that they don’t affect the future, and, given our observations, we are more likely to be wrong than right. That is sensible in hindsight, as the way people interact with each other has changed: people work from home, choose to stay in when they feel sick, and socialise less. The COVID-19 pandemic might not be recurring now, but it is still affecting flu patterns.
When generalising this conclusion to similarly affected datasets, the best option is still, of course, to check the data.
Future work
Within the scope of this exercise we haven’t studied the data using machine learning models, and we are not claiming that they would show the same results. It is possible that an ML model trained on pre-pandemic data alone could forecast the post-pandemic period better than one trained on the whole set. This could be an interesting experiment to conduct.