Multivariate Time Series using VAR

Mehul Gupta
Data Science in your pocket
3 min read · Jun 27, 2019


In my past articles (check out the links below), I provided some solutions for univariate Time Series, i.e. where no factor other than the series itself is considered while forecasting. In a Multivariate Time Series, many other factors affecting the values are considered as well (as in real-life examples), like in the data given below. Here each factor affects the forecast of the other factors! Let's explore how to deal with that.

Courtesy Analytics Vidhya

Considering the above data, I will forecast Temperature using VAR.

Vector AutoRegression (VAR) can be taken as an extension of AR (as explained here), with some changes. But first of all, consider an AR of Order 1

Temperature_3= a + b*Temperature_2 + e

Order 2

Temperature_3= a + b*Temperature_2 + c*Temperature_1 + e

where a, b and c are constants and e is a white noise term (explained here)
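To make the two AR equations concrete, here is a minimal Python sketch of a one-step forecast. The function name `ar_forecast`, the temperature readings and the coefficient values are all made up for illustration; in practice a, b and c would be estimated from the data. The white noise term e cannot be known in advance, so the sketch replaces it with its mean, 0.

```python
def ar_forecast(history, const, coefs):
    """One-step AR forecast: const + coefs[0]*latest value + coefs[1]*the one before + ...

    The white noise term e is unknown ahead of time, so it is taken as 0 (its mean).
    """
    order = len(coefs)
    if len(history) < order:
        raise ValueError("need at least as many past values as the order")
    # history[-1] is the most recent value, history[-2] the one before it, etc.
    lags = [history[-(i + 1)] for i in range(order)]
    return const + sum(c * y for c, y in zip(coefs, lags))


temperature = [20.0, 22.0]  # Temperature_1, Temperature_2 (made-up values)

# Order 1: Temperature_3 = a + b*Temperature_2
order1 = ar_forecast(temperature, const=2.0, coefs=[0.9])

# Order 2: Temperature_3 = a + b*Temperature_2 + c*Temperature_1
order2 = ar_forecast(temperature, const=2.0, coefs=[0.6, 0.3])

print(order1, order2)
```

The order is simply the number of coefficients passed in, so the same function covers both equations.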

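For Vector AutoRegression, the equation incorporates all the existing features, that is Humidity, Cloud Cover, Dew Point and Wind, alongside Temperature itself. The Temperature equation for an order 1 VAR becomes

3rd term

Temperature_3 = a_T + b_T*Humidity_2 + c_T*CloudCover_2 + d_T*DewPoint_2 + g_T*Wind_2 + f_T*Temperature_2 + e_3

4th term

Temperature_4 = a_T + b_T*Humidity_3 + c_T*CloudCover_3 + d_T*DewPoint_3 + g_T*Wind_3 + f_T*Temperature_3 + e_4

where a_T, b_T, c_T, d_T, g_T and f_T are constants and e_3, e_4 are white noise terms. The subscript T marks the constants as belonging to the Temperature equation; in a standard VAR they stay the same from one term to the next, and only the lagged values change.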

Confusions?

  • How would I get Humidity_3 or CloudCover_3 if the data is given only till the 2nd row ('_2') and we need to predict the 4th term, which needs the 3rd-row terms of the other factors as well?

VAR builds one such equation for every factor, not just Temperature, so the other factors get forecast along with it. For example, if only 3 rows are given, Humidity_4, Wind_4, CloudCover_4, etc. are forecast from the 3rd row at the same time as Temperature_4, and these forecasted values then stand in for the missing rows when predicting Temperature_5, Temperature_6 and so on.
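In the standard VAR formulation every factor has its own equation, so a common way to go several steps ahead is to forecast all the factors together and feed each forecast row back in as the input for the next step. Below is a minimal sketch of that loop for order 1. It uses only two factors (Temperature and Humidity) to stay short, and the function name `var1_forecast` and every number in it are made up for illustration, not fitted values.

```python
def var1_forecast(last_row, const, coef, steps):
    """Forecast `steps` rows ahead with an order 1 VAR.

    last_row: last known values, one per variable
    const:    one constant per variable
    coef:     K x K list of lists; coef[i][j] weights the lag of variable j
              in the equation for variable i
    """
    k = len(last_row)
    forecasts = []
    current = list(last_row)
    for _ in range(steps):
        # Every variable is forecast from the same previous row...
        nxt = [
            const[i] + sum(coef[i][j] * current[j] for j in range(k))
            for i in range(k)
        ]
        forecasts.append(nxt)
        # ...and the whole forecast row becomes the input for the next step.
        current = nxt
    return forecasts


# Variables in order: [Temperature, Humidity]; all numbers are made up.
row_3 = [25.0, 60.0]
const = [5.0, 20.0]
coef = [
    [0.7, 0.05],  # Temperature equation
    [0.2, 0.6],   # Humidity equation
]

rows = var1_forecast(row_3, const, coef, steps=2)
for number, row in enumerate(rows, start=4):
    print(f"row {number}: Temperature={row[0]:.2f}, Humidity={row[1]:.2f}")
```

Note that the forecast for row 5 never touches real data for row 4; it only uses the row 4 values the model itself produced.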

The picture below should make this clearer

Here,

  • y1(t), y2(t), …, yk(t) are the values being forecast at time t, one per variable (Temperature is one of them for us)
  • a1, a2, …, ak are constants
  • The K x K matrix provides coefficients for the lagged versions of the different features/factors used for forecasting, including the lagged version of the forecasted variable itself. The number of K x K matrices needed is equal to the Order of the VAR (one for each lag/shift).
  • y1(t-1), y2(t-1), … are the lagged values of the different features used in forecasting
  • e1, e2, e3, …, ek are white noise terms

Note: As you can see, for a given order each lagged feature value gets its own coefficient; no coefficient is shared between features.
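Written out with the symbols listed above, the order 1 system would look roughly like this. The entry names w_ij for the K x K matrix are my own notation (an assumption, they are not named in the text): w_ij is the coefficient of the lagged variable j in the equation for variable i.

```latex
\begin{bmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_k(t) \end{bmatrix}
=
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_k \end{bmatrix}
+
\begin{bmatrix}
w_{11} & w_{12} & \cdots & w_{1k} \\
w_{21} & w_{22} & \cdots & w_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
w_{k1} & w_{k2} & \cdots & w_{kk}
\end{bmatrix}
\begin{bmatrix} y_1(t-1) \\ y_2(t-1) \\ \vdots \\ y_k(t-1) \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_k \end{bmatrix}
```

For order 2, a second K x K matrix multiplying the y(t-2) values would be added to the right-hand side, and so on for higher orders.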

Also, as you must have guessed by now, every variable forecasted has its own equation in VAR, i.e. if we forecast Y (say Temperature) with order 1, the coefficients used for the lagged features are different from the ones used when forecasting X (say Humidity) with order 1.

Y3 = x1 + a1*Y2 + b1*X2…

X3 = x2 + a2*Y2 + b2*X2…

Hence, even though the order is 1 and both equations use the same lagged features, the coefficients are different (a1 ≠ a2). Within one equation, though, the coefficients stay the same from one term to the next.
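To see the separate sets of coefficients come out of actual numbers, here is a rough sketch that simulates two series from a known order 1 VAR and then recovers the coefficients with ordinary least squares using NumPy. The "true" constants and coefficients are made up, and plain least squares is just one simple way to estimate them; this is not claimed to be how any particular library does it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate two series [Y, X] from a known order 1 VAR (all numbers are made up).
true_const = np.array([1.0, 2.0])
true_coef = np.array([
    [0.5, 0.2],  # equation for Y: weights on lagged Y and lagged X
    [0.1, 0.6],  # equation for X: weights on lagged Y and lagged X
])
n = 2000
data = np.zeros((n, 2))
for t in range(1, n):
    data[t] = true_const + true_coef @ data[t - 1] + rng.normal(size=2)

# Regress each row on the previous row; columns are [1, Y(t-1), X(t-1)].
lagged = np.column_stack([np.ones(n - 1), data[:-1]])
target = data[1:]
params, *_ = np.linalg.lstsq(lagged, target, rcond=None)

est_const = params[0]    # one constant per equation
est_coef = params[1:].T  # row i holds the coefficients of equation i

print("estimated constants:", est_const)
print("estimated coefficients:")
print(est_coef)
```

Each row of `est_coef` belongs to one equation, which is the "different coefficients for each forecasted variable" idea in matrix form.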

Multivariate Time Series can be handled in a number of other ways too, using extensions of the AR & MA models like VMA, VARMA, VARIMA, etc. Do explore!
