Autoregressive Distributed Lag (ARDL) time series forecasting model explained

The mathematical explanation for ARDL with example

Mehul Gupta
Data Science in your pocket



After discussing a few time-series forecasting models in the past, I will now be talking about some rarely explored time series models, starting with ARDL, i.e. Autoregressive Distributed Lag (that’s quite a name).

My debut book “LangChain in your Pocket” is out now!


Before jumping in, we must know what the following are:

Endogenous variables (dependent variables): Variables that depend on other variables in the ecosystem.

Exogenous variables (independent variables): Variables whose values are decided more or less by factors outside the ecosystem. An endogenous variable may depend on exogenous variables.

Back to ARDL.

As the name suggests, ARDL is an extension of AR models (Auto Regression)

You remember Auto Regression, right? If not, I have got you covered.

So, unlike AR, which depends entirely on self-lagged values, ARDL has 4 components that are summed together to get to the final forecast:

Self lagged values

Distributed lags

Seasonality component

Trend component

Forecast = Self-lag + Distributed-lag + Seasonality + Trend

Let’s understand each term starting with Trend

Trend Component = e + x0 + x1*t + x2*t² + … + xk*tᵏ

What is ‘k’ & ‘t’ in the above equation?

Trend in a time series can be linear, quadratic (ax² + bx + c), or of some higher degree (a non-linear equation). So, in the above equation, ‘k’ is the degree of the trend. Let’s see what the trend component would look like given a linear, quadratic, or higher-degree trend in the time series:

Linear = e + x0 + x1*t

Quadratic = e + x0 + x1*t + x2*t²

Kth degree = e + x0 + x1*t + x2*t² + … + xk*tᵏ

where t = 1, 2, 3, … is the time index and k is the degree of the trend
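As a quick sketch (all coefficient values below are made up for illustration), the linear and quadratic trend components can be computed like this:

```python
import numpy as np

# Trend components x0 + x1*t (+ x2*t^2) over a short time index.
# x0, x1, x2 are assumed coefficients, not values from any real fit.
t = np.arange(1, 11)             # time index t = 1, 2, ..., 10
x0, x1, x2 = 5.0, 0.3, 0.02      # made-up coefficients

linear_trend = x0 + x1 * t                 # degree k = 1
quadratic_trend = x0 + x1 * t + x2 * t**2  # degree k = 2

print(quadratic_trend[:3])       # trend values at t = 1, 2, 3
```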

Seasonality component = ΣᵢXᵢ Sᵢ

Seasonality is modeled using seasonal dummy variables in ARDL.

What are seasonal dummy variables?

These are a set of variables created to depict seasonality. For example, if we have quarterly seasonality (every 3 months) in a dataset spanning 5 years, then to model seasonality we can create 4 variables, Q1, Q2, Q3, and Q4, and, using the one-hot encoding technique, set one of them to 1 and the rest to 0 at any given timestamp. So if the timestamp is 15th Jan, Q1 = 1 and the others are 0, hence [1, 0, 0, 0]. Similarly, [0, 0, 1, 0] for timestamp = 15th Aug. Likewise, we can have 2 variables to depict 6-monthly seasonality (1st half, 2nd half). The idea is to have a set of variables where exactly one variable equals 1 and the rest equal 0 at any timestamp t.

For more on seasonal dummies, refer here

The Seasonality component is the weighted sum of the seasonal dummy variables at timestamp t. If you look closely, at any timestamp t, at most one dummy variable is non-zero. So assume we wish to forecast Y for 2nd May ’22, hence t = 2/05/2022. As t falls in Q2, the seasonal component equals

ΣᵢXᵢ Sᵢ = X₀S₀ + X₁S₁ + X₂S₂ + X₃S₃

= 0 + X₁ + 0 + 0 = X₁

The rest of the terms become 0 because the respective dummy variables were 0; only S₁ equals 1.

Xᵢ are the coefficients associated with each seasonal dummy variable
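As a sketch, the quarterly dummies described above can be built with pandas one-hot encoding (the dates and the `Q_` column prefix are my own choices for illustration):

```python
import pandas as pd

# One-hot quarterly dummy variables for a few example dates.
dates = pd.to_datetime(["2022-01-15", "2022-05-02", "2022-08-15"])

# Force all four quarter categories so every dummy column exists
# even if a quarter is absent from the sample.
quarters = pd.Categorical(dates.quarter, categories=[1, 2, 3, 4])
dummies = pd.get_dummies(quarters, prefix="Q")   # columns Q_1 .. Q_4

print(dummies.astype(int))
# 2022-01-15 -> [1, 0, 0, 0], 2022-05-02 -> [0, 1, 0, 0], etc.
```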

All the terms probably feel alright (at least you have heard of them before) except distributed lags, right?

Distributed lags

Distributed lag is nothing but the weighted sum of lagged versions of the exogenous variables in the system. So suppose we have X as the dependent/endogenous variable, and Y & Z as exogenous variables on which X depends.

The distributed-lags component is best understood with an example. Assume we have 2 exogenous variables Y & Z with lag orders 2 & 1 respectively. Then the distributed-lags component takes the form below:

Distributed Lags (order 2, 1) = a1*Yt-1 + a2*Yt-2 + b1*Zt-1

One thing to notice is that for each exogenous variable, we define a separate lag order. So we may consider a lag of order 2 for variable Y and a lag of order 1 for variable Z when forecasting some endogenous variable X.
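A tiny numeric sketch of this distributed-lags term, with made-up coefficients and observations:

```python
# Distributed-lags term a1*Y(t-1) + a2*Y(t-2) + b1*Z(t-1), lag orders 2 and 1.
# All numbers below are invented for the sketch.
a1, a2, b1 = 0.5, 0.2, -0.3     # assumed coefficients
Y = [10.0, 12.0, 11.0]          # Y observed at t-3, t-2, t-1 (most recent last)
Z = [4.0, 5.0, 6.0]             # Z observed at t-3, t-2, t-1

dist_lag = a1 * Y[-1] + a2 * Y[-2] + b1 * Z[-1]
print(dist_lag)                 # 0.5*11 + 0.2*12 - 0.3*6
```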

Auto Regression

Assuming you have read about AR after my heads-up: AR is nothing but a linear combination of self-lagged versions of the variable. Below is an example of AR of order 2 for forecasting Xt:

Auto Regression (order 2) = m1*Xt-1 + m2*Xt-2
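The same kind of numeric sketch for the AR(2) term, again with made-up values:

```python
# AR(2) term m1*X(t-1) + m2*X(t-2). Coefficients and observations are invented.
m1, m2 = 0.6, 0.3               # assumed coefficients
X = [18.0, 22.0, 21.0]          # X observed at t-3, t-2, t-1 (most recent last)

ar_term = m1 * X[-1] + m2 * X[-2]
print(ar_term)                  # 0.6*21 + 0.3*22
```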

Let’s combine everything & forecast X at timestamp ‘t’ such that

  • The linear trend is present in endogenous variable X
  • Seasonality is Quarterly
  • We have 2 exogenous variables Y, Z with lag orders 2 & 1
  • Auto Regression lag order 2

Xt = (e + x0 + x1*t) + (k₀*S₀ + k₁*S₁ + k₂*S₂ + k₃*S₃) + (a1*Yt-1 + a2*Yt-2 + b1*Zt-1) + (m1*Xt-1 + m2*Xt-2)

Where

  • e is the error term
  • x0, x1, k₀, k₁, k₂, k₃, a1, a2, b1, m1, m2 are the coefficients of the respective terms
  • t-1 & t-2 represent the last and 2nd-last observed timestamps. So Yt-1 is the Y value observed at timestamp t-1.

Before leaving,

does ARDL look similar to VAR (Vector Auto Regression)?

Do you remember VAR? Do check it out here

So in VAR modeling, which itself is an extension of AR modeling, a variable X is forecasted using a combination of

  1. Self-lagged features (AR)
  2. Lagged features of the other variables on which this variable X is dependent

So are ARDL & VAR the same? Not at all.

VAR & ARDL differ on 3 major points

  • In VAR, every variable considered has the same lag order, unlike ARDL where we can have different lag orders for different variables.
  • Trend & Seasonality components aren’t considered in VAR.
  • In VAR, we have a different equation for each variable modeled. So the equation for forecasting Xt differs from the one for forecasting Yt !!

By different equations, we mean the coefficients used in the linear equation change for every variable forecasted.

So, for example, suppose we model X, Y & Z jointly (in VAR, all variables are treated as endogenous). Assume we forecast Xt and Yt using a VAR of order 1:

Xt = e1 + a11*Xt-1 + b11*Yt-1 + c11*Zt-1

Yt = e2 + a21*Xt-1 + b21*Yt-1 + c21*Zt-1

Here, the coefficients multiplying the same 1st-lag terms differ across the two equations of the same order-1 VAR.

That’s it! I will resume with some more rarely known forecasting methods in my next post.
