Published in Analytics Vidhya · 6 min read · Jun 18, 2021

A time series forecasting series.

Holt-Winters forecasting is a way to model and predict the behavior of a sequence of values over time — a time series. Holt-Winters is one of the most popular forecasting techniques for time series.

It’s decades old, but it’s still ubiquitous in many applications, including monitoring, where it’s used for purposes such as anomaly detection and capacity planning.

Holt-Winters is a model of time series behavior. Forecasting always requires a model, and Holt-Winters is a way to model three aspects of the time series: a typical value (average), a slope (trend) over time, and a cyclical repeating pattern (seasonality). Holt-Winters uses exponential smoothing to encode lots of values from the past and use them to predict “typical” values for the present and future. If you’re not familiar with exponential smoothing, it is explained later in this article.

Holt-Winters Method

There are two variations of this method, which differ in how the seasonal component is modeled.

The additive method is preferred when the seasonal variations are roughly constant through the series, while the multiplicative method is preferred when the seasonal variations change in proportion to the level of the series.

There are three components.

  • Trend
  • Seasonality
  • Level

Additive Seasonality

l = level, b = trend, s = seasonal component, m = number of periods in one season, h = forecast horizon (how many periods ahead)
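In the notation above, the h-step-ahead forecast of the additive method is usually written as shown below. This is the standard textbook form rather than a formula taken from this article. The index k, the integer part of (h-1)/m, is an extra symbol introduced here so that the seasonal term always comes from the last observed season.

```latex
\hat{y}_{t+h|t} = l_t + h\,b_t + s_{t+h-m(k+1)},
\qquad k = \left\lfloor \frac{h-1}{m} \right\rfloor
```

The seasonal term is added on top of the level and trend, which is why the seasonal swings keep the same size no matter how high the level is.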

Multiplicative Seasonality

l = level, b = trend, s = seasonal component, m = number of periods in one season, h = forecast horizon (how many periods ahead)
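With the same symbols, the h-step-ahead forecast of the multiplicative method is usually written as shown below. Again, this is the standard textbook form, and k, the integer part of (h-1)/m, is an extra symbol introduced here.

```latex
\hat{y}_{t+h|t} = (l_t + h\,b_t)\; s_{t+h-m(k+1)},
\qquad k = \left\lfloor \frac{h-1}{m} \right\rfloor
```

Here the seasonal term scales the level and trend instead of being added to them, so the seasonal swings grow as the level grows.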

Holt-Winters — Important concepts


  • Simple Moving Average
  • Weighted Moving Average
  • Exponential Weighted Moving Average or Exponential Smoothing

Simple Moving Average

A simple moving average (SMA) is the unweighted mean of the K most recent values.
Example with K=2:
SMA = (15+20)/2 = 17.5
SMA = (20+25)/2 = 22.5

Example with K=3:
SMA = (15+20+25)/3 = 20
SMA = (20+25+30)/3 = 25
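The arithmetic above can be reproduced with a few lines of base R. This is a minimal sketch: the helper name `sma` is hypothetical, and the series 15, 20, 25, 30 is inferred from the worked numbers rather than stated in the article.

```r
# Simple moving average: unweighted mean of the k most recent values.
# `sma` is a hypothetical helper name used only for this illustration.
sma <- function(y, k) {
  n <- length(y)
  sapply(k:n, function(t) mean(y[(t - k + 1):t]))
}

# Series inferred from the worked example (an assumption).
y <- c(15, 20, 25, 30)

sma(y, 2)  # 17.5 22.5 27.5
sma(y, 3)  # 20 25
```

The first k-1 positions have no average because the window is not yet full, so the result is shorter than the input.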

Weighted Moving Average

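A weighted moving average (WMA) gives more weight to recent values: within a window of K values, the most recent value gets weight K, the one before it K-1, and so on down to 1, and the weighted sum is divided by the sum of the weights.
WMA for K=2
WMA = (20*2+15*1)/3 = 18.33
WMA = (16*2+20*1)/3 = 17.33
WMA for K=3
WMA = (16*3+20*2+15*1)/6 = 17.17
WMA = (13*3+16*2+20*1)/6 = 15.17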
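The same weighting scheme can be sketched in base R. The helper name `wma` is hypothetical, and the series 15, 20, 16, 13 is inferred from the worked numbers above rather than stated in the article.

```r
# Weighted moving average with linearly increasing weights 1..k,
# so the newest value in each window gets the largest weight.
# `wma` is a hypothetical helper name used only for this illustration.
wma <- function(y, k) {
  w <- 1:k
  n <- length(y)
  sapply(k:n, function(t) sum(w * y[(t - k + 1):t]) / sum(w))
}

# Series inferred from the worked example (an assumption).
y <- c(15, 20, 16, 13)

round(wma(y, 2), 2)  # 18.33 17.33 14.00
round(wma(y, 3), 2)  # 17.17 15.17
```

Compared with the simple moving average, the result reacts faster to the most recent values because they carry more of the total weight.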

Exponential Smoothing

  • Averaging over long periods dampens fluctuations, removing not only the noise but also trend and seasonality
  • Moving averages over short recent periods maintain trend and seasonality but determining an optimum number for periods is tricky, even when using metrics like MAE.
  • If averaged over too few periods, irregularities continue to remain, and if averaged over long periods, dampening again becomes a problem.
  • Exponential smoothing retains all older periods while giving greater weight to more recent periods (hence not a MOVING average).

Stationary Model: Exponential Smoothing

ŷt+1 = ŷt + α(yt − ŷt)
where
ŷt+1 = predicted value for time period t+1
ŷt = predicted value for the previous period
yt = actual value for the previous period
α = smoothing constant between 0 and 1 (0 ≤ α ≤ 1)
(yt − ŷt) = adjustment for the error made in predicting the previous period's value
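The update rule described here, new prediction = previous prediction + α × (previous error), can be sketched as a short loop. The helper name `ses`, the choice of the first observation as the starting prediction, the sample series, and α = 0.5 are all assumptions made for illustration.

```r
# Simple exponential smoothing:
#   yhat[t + 1] = yhat[t] + alpha * (y[t] - yhat[t])
# `ses` is a hypothetical helper; starting from y[1] is an assumption.
ses <- function(y, alpha, init = y[1]) {
  n <- length(y)
  yhat <- numeric(n + 1)
  yhat[1] <- init
  for (t in 1:n) {
    yhat[t + 1] <- yhat[t] + alpha * (y[t] - yhat[t])
  }
  yhat  # the last element is the forecast for the next, unseen period
}

y <- c(15, 20, 25, 30)   # illustrative values
ses(y, alpha = 0.5)      # 15.000 15.000 17.500 21.250 25.625
```

With α close to 1 the prediction follows the latest value almost exactly, while α close to 0 makes it change very slowly. Every past value still contributes, with a weight that shrinks exponentially with age.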
  • Holt-Winters uses three smoothing parameters (weights), α for the level, β for the trend, and γ for the seasonality, to update the components at each period.

For Additive smoothing components
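For reference, the additive component updates are commonly written as shown below. This is the standard textbook form rather than a formula recovered from this article. Here y_t is the observed value, l, b and s are the level, trend and seasonal components, m is the number of periods in a season, and α, β, γ are the three smoothing parameters.

```latex
\begin{aligned}
l_t &= \alpha\,(y_t - s_{t-m}) + (1-\alpha)\,(l_{t-1} + b_{t-1}) \\
b_t &= \beta\,(l_t - l_{t-1}) + (1-\beta)\,b_{t-1} \\
s_t &= \gamma\,(y_t - l_{t-1} - b_{t-1}) + (1-\gamma)\,s_{t-m}
\end{aligned}
```

Each line has the same shape as simple exponential smoothing: a weighted average of a fresh estimate from the newest observation and the previous estimate of that component.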

For Multiplicative seasonality
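For reference, the multiplicative component updates are commonly written as shown below. This is the standard textbook form rather than a formula recovered from this article. The symbols are the same as in the additive case: y_t is the observed value, l, b and s are the level, trend and seasonal components, m is the number of periods in a season, and α, β, γ are the smoothing parameters.

```latex
\begin{aligned}
l_t &= \alpha\,\frac{y_t}{s_{t-m}} + (1-\alpha)\,(l_{t-1} + b_{t-1}) \\
b_t &= \beta\,(l_t - l_{t-1}) + (1-\beta)\,b_{t-1} \\
s_t &= \gamma\,\frac{y_t}{l_{t-1} + b_{t-1}} + (1-\gamma)\,s_{t-m}
\end{aligned}
```

The only change from the additive form is that the seasonal effect is removed and measured by division instead of subtraction. The trend update is identical.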

Now, let’s skip the math and understand how Holt-Winters works.

Holt-Winters by hand

Here is the R code:
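x <- seq(1, 20, 1)          # indices 1..20; the values are overwritten below
for (i in x) {
  x[i] = 0                  # default value
  if ((i - 2) %% 5 == 0) {
    x[i] = 1                # every fifth point, starting at the 2nd, is a spike
  }
}
plot.ts(x, ylab="Value", main="Plot of simple time series")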

The pattern is obvious: the plot repeats the values [0, 1, 0, 0, 0].

What would it look like if we made the values relative to the average of those 5 points? The average, (0+1+0+0+0)/5, is 0.2, which we’ll draw on the plot as a horizontal line:

Recall that Holt-Winters has a trend component. If we set its parameter to zero, Holt-Winters ignores the trend (slope), so the model simplifies. Now, it’s just a bunch of values relative to the average. In our plot, the values relative to 0.2 are [-0.2, 0.8, -0.2, -0.2, -0.2]. If we did Holt-Winters without trend, that’s the type of model we’d build.
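The trend-free model described here can be built by hand in a few lines. This is only a sketch of the idea (a fixed level plus one offset per position in the season), not the full smoothing procedure, and the variable names are chosen for illustration.

```r
# The toy series: [0, 1, 0, 0, 0] repeated four times.
x <- rep(c(0, 1, 0, 0, 0), times = 4)

level <- mean(x)              # 0.2, the horizontal line
seasonal <- x[1:5] - level    # offsets: -0.2 0.8 -0.2 -0.2 -0.2

# With no trend, a forecast is the level plus the matching seasonal offset.
forecast <- level + rep(seasonal, length.out = 10)
round(forecast, 3)            # 0 1 0 0 0 0 1 0 0 0
```

Because the series repeats exactly, the level and the five offsets describe it completely, and the forecast simply continues the pattern.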

Here’s what the HoltWinters function in R gives, with some annotations in blue that I added manually:
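A call along the following lines fits a trend-free model to the toy series with `HoltWinters` from R's built-in stats package. The values of `alpha` and `gamma` are fixed at arbitrary illustrative numbers, which is an assumption made here. The article does not say which settings were used.

```r
# The toy series: [0, 1, 0, 0, 0] repeated four times, 5 periods per season.
x <- rep(c(0, 1, 0, 0, 0), times = 4)
xt <- ts(x, frequency = 5)

# beta = FALSE switches the trend component off.
# alpha and gamma are fixed, illustrative values (an assumption);
# leaving them NULL asks HoltWinters() to estimate them instead.
fit <- HoltWinters(xt, alpha = 0.2, beta = FALSE, gamma = 0.2)

fit$coefficients                  # level "a" and seasonal terms "s1".."s5"
fc <- predict(fit, n.ahead = 5)   # forecast one more season
round(as.numeric(fc), 3)
```

For this perfectly periodic series the fitted level should stay near 0.2 and the seasonal terms near [-0.2, 0.8, -0.2, -0.2, -0.2], so the forecast should repeat the original pattern.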

Forecasting with trend is just an enhancement of this. Instead of using a fixed average as the foundation, you just have to incorporate the slope of the line. Here’s a model that has a trend:

The example series repeats itself every five points, i.e. the season is 5 periods.

The right seasonality is crucial to Holt-Winters forecasting

To illustrate this, let’s see what happens when you use a season of 6 periods, one greater than the actual season of 5 periods:

The forecast, which is the red line in the chart, becomes much less accurate and is no longer useful. To get good results, you need to give the model good parameters. This is a key challenge with Holt-Winters forecasting.

Picking the seasonality is a hard problem. General-purpose forecasting is hard because it has to be ready to use on any dataset, which might have any combination of values, trends, and seasonality. It might not even have some of those components.

One way to measure the accuracy of a forecast is to calculate the differences between the predicted values and the actual values. The blue arrows in the following chart represent how far off the prediction was from the actual value.

To quantify overall accuracy, you can combine these differences into a single value by taking the average or the sum of squared values.

The result is a value that is smaller if the forecast is better, and larger if the forecast is worse. This gives you a good way to compare forecast results.
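As a small worked example, the snippet below combines the individual differences into a sum of squared errors (SSE) and a mean squared error (MSE). The `actual` and `predicted` values are made-up numbers used only to show the arithmetic.

```r
# Made-up values for illustration only.
actual    <- c(0, 1, 0, 0, 0)
predicted <- c(0.1, 0.7, 0.1, 0, 0.1)

errors <- actual - predicted   # -0.1 0.3 -0.1 0.0 -0.1
sse <- sum(errors^2)           # sum of squared errors: 0.12
mse <- mean(errors^2)          # mean squared error:    0.024
```

Squaring keeps positive and negative errors from cancelling out and penalizes large misses more heavily than small ones.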

Our forecasting code tries lots of combinations with different parameters and picks the ones that generate the lowest combined error score.
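The idea of trying several settings and keeping the one with the lowest error can be sketched for the season length alone. This is not the forecasting code the article refers to. It is a minimal illustration using `HoltWinters` from R's stats package, with a hypothetical helper `season_mse`, fixed illustrative values for `alpha` and `gamma`, and the trend switched off.

```r
# The toy series: [0, 1, 0, 0, 0] repeated four times.
x <- rep(c(0, 1, 0, 0, 0), times = 4)

# Mean squared one-step-ahead error for a candidate season length m.
# `season_mse` is a hypothetical helper; alpha and gamma are fixed,
# illustrative values (an assumption), and beta = FALSE disables the trend.
season_mse <- function(x, m, alpha = 0.2, gamma = 0.2) {
  fit <- HoltWinters(ts(x, frequency = m),
                     alpha = alpha, beta = FALSE, gamma = gamma)
  fit$SSE / (length(x) - m)   # fitted values start after the first season
}

candidates <- 2:6
scores <- sapply(candidates, function(m) season_mse(x, m))
best_m <- candidates[which.min(scores)]
best_m
```

For this toy series the lowest score should belong to a season of 5 periods, since only that choice lets the seasonal offsets line up with the repeating pattern. A fuller search would vary the smoothing parameters as well.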

To illustrate this, here are several forecasts on the same time series, each using a different season length:

The one with the right seasonality (5 periods per season) is easy to pick out visually because the differences between the data and the forecast are small. This is a visual example of what our forecasting does through optimization. It also optimizes other parameters, such as the trend.


Holt-Winters forecasting is surprisingly powerful despite its simplicity. It can handle many complicated seasonal patterns by simply finding the central value and then adding in the effects of slope and seasonality. The trick is giving it the right parameters. This solution is simple to build and understand, which is valuable for our purposes.