Introduction to Vector Autoregression

Manish Kumar
Published in Analytics Vidhya
5 min read · Jan 14, 2020

INTRODUCTION

GitHub: https://github.com/vatsmanish/Time-series-vector-autoregression-model

What you will learn from this article:

· Why vector autoregression is important

· How vector autoregression works

· A real-life practical example of vector autoregression

· A practical implementation of vector autoregression using a World Bank dataset to forecast the GDP of countries

First, what is Vector Autoregression (VAR) and when to use it?

Vector Autoregression (VAR) is a multivariate forecasting algorithm that is used when two or more time series influence each other.

That means, the basic requirements in order to use VAR are:

  1. You need at least two time series (variables).
  2. The time series should influence each other.
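One common way to check the second requirement is a Granger-style test: does adding the lag of one series improve the prediction of the other? The sketch below uses simulated data and base R only; the series x and y, their coefficients and the single lag are assumptions made for illustration, not part of the World Bank example.

```r
# Simulate two series where the past of x influences y (hypothetical data)
set.seed(7)
n <- 500
x <- numeric(n)
y <- numeric(n)
for (t in 2:n) {
  x[t] <- 0.5 * x[t - 1] + rnorm(1)
  y[t] <- 0.3 * y[t - 1] + 0.5 * x[t - 1] + rnorm(1)
}

# Align each value of y with the lag-1 values of y and x
d <- data.frame(y = y[-1], y.l1 = y[-n], x.l1 = x[-n])

# Restricted model: y explained by its own lag only
restricted <- lm(y ~ y.l1, data = d)
# Unrestricted model: the lag of x is added as a predictor
unrestricted <- lm(y ~ y.l1 + x.l1, data = d)

# F-test comparing the two models; a small p-value suggests x helps predict y
p_value <- anova(restricted, unrestricted)[2, "Pr(>F)"]
print(p_value)
```

This checks one direction only (x to y). For VAR you would run the same comparison with the roles of x and y swapped.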

Alright. So why is it called ‘Autoregressive’?

It is considered an autoregressive model because each variable (time series) is modeled as a function of past values; that is, the predictors are nothing but the lags (time-delayed values) of the series.

Ok, so how is VAR different from other Autoregressive models like AR, ARMA or ARIMA?

The primary difference is that those models are uni-directional: the predictors influence Y and not vice versa. Vector autoregression (VAR), in contrast, is bi-directional: the variables influence each other.

“A Vector autoregressive (VAR) model is useful when one is interested in predicting multiple time series variables using a single model. At its core, the VAR model is an extension of the univariate autoregressive model.”
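To make "a single model for multiple series" concrete, here is a minimal sketch in base R. It simulates two series that depend on each other's past and then estimates one regression per series on the lags of both. The matrix A, the series names y1 and y2 and the sample size are assumptions chosen for illustration.

```r
set.seed(42)
n <- 2000

# True lag-1 coefficients (hypothetical): row i holds the equation for series i
A <- matrix(c(0.5, 0.2,
              0.1, 0.4), nrow = 2, byrow = TRUE)

# Simulate a bivariate VAR(1): each value depends on the previous values of both series
y <- matrix(0, nrow = n, ncol = 2, dimnames = list(NULL, c("y1", "y2")))
for (t in 2:n) {
  y[t, ] <- A %*% y[t - 1, ] + rnorm(2)
}

# Pair every observation with the lag-1 values of both series
lagged <- data.frame(y1 = y[-1, "y1"], y2 = y[-1, "y2"],
                     y1.l1 = y[-n, "y1"], y2.l1 = y[-n, "y2"])

# One least-squares regression per series, each using the lags of both series
fit_y1 <- lm(y1 ~ y1.l1 + y2.l1, data = lagged)
fit_y2 <- lm(y2 ~ y1.l1 + y2.l1, data = lagged)

# Stack the estimated lag coefficients (intercepts dropped) to compare with A
A_hat <- rbind(coef(fit_y1)[-1], coef(fit_y2)[-1])
print(round(A_hat, 2))
```

The estimates in A_hat should land close to A, which shows that the influence running in both directions is recovered from the lags alone.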

How vector autoregression works

Let’s look at some mathematical intuition behind this algorithm.

Start with the univariate case: an autoregressive model of order p for a single series Y. Here α is the intercept (a constant) and β1, β2, ..., βp are the coefficients of the lags of Y up to order p.

In the univariate case we have only one column (one variable). Now suppose we want to add more columns.

For that we move to multivariate data, i.e. we add more variables to our dataset.

With two variables there are two equations, one per row. As in autoregression, each equation uses lagged values, e.g. lag = 1 (the previous value), and t indexes the time step. Each row of the coefficient matrix is multiplied by the vector of lagged values, and the products are summed.

As we increase the number of variables, the number of constant and coefficient terms increases.
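Written out in standard textbook notation, the univariate AR(p) model and a two-variable VAR(1) look as follows. The names Y_1 and Y_2, the double-indexed coefficients and the error terms ε are notation assumed here for illustration.

```latex
% Univariate AR(p)
Y_t = \alpha + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \epsilon_t

% Two-variable VAR(1): one equation (row) per variable
Y_{1,t} = \alpha_1 + \beta_{11,1} Y_{1,t-1} + \beta_{12,1} Y_{2,t-1} + \epsilon_{1,t}
Y_{2,t} = \alpha_2 + \beta_{21,1} Y_{1,t-1} + \beta_{22,1} Y_{2,t-1} + \epsilon_{2,t}

% The same system in matrix form
\begin{bmatrix} Y_{1,t} \\ Y_{2,t} \end{bmatrix}
=
\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}
+
\begin{bmatrix} \beta_{11,1} & \beta_{12,1} \\ \beta_{21,1} & \beta_{22,1} \end{bmatrix}
\begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \end{bmatrix}
+
\begin{bmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{bmatrix}
```

With k variables and p lags, each equation has one constant and kp lag coefficients, so the whole system has k + k²p parameters: 2 + 4 = 6 for the two-variable VAR(1) above.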

Real-life practical example of vector autoregression

Let’s move on to the next step.

A big part of statistics, particularly for financial and econometric data, is analyzing time series, data that are autocorrelated over time. That is, one observation depends on previous observations and the order matters. Special care needs to be taken to account for this dependency. R has a number of built-in functions and packages to make working with time series easier.

Here I use a World Bank dataset, fetched from the World Bank API in R using the “WDI” package.

Dataset description: the dataset contains GDP, per capita GDP, year, country name and country code.

Year: 1960 to 2018

Country: United States, Canada, Australia, China, India, Pakistan, Saudi Arabia

Country code: US, CA, AUS, CHN, IND, PAK, SAU

PerCapGDP: GDP per capita of each country

GDP: GDP of each country

When dealing with multiple time series where each depends on its own past, others’ pasts and others’ presents, things get more complicated. The first thing we will do is convert all of the GDP data into a multivariate time series. To do this we first cast the data.frame to wide format and then call ts to convert it.
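As a small illustration of the cast-then-ts step, here is a sketch on a made-up long-format data frame. It uses base R's reshape() instead of a casting package, and the country names and values are hypothetical.

```r
# Hypothetical long-format data: one row per (Year, Country) pair
long <- data.frame(
  Year    = rep(2000:2003, times = 2),
  Country = rep(c("Canada", "India"), each = 4),
  GDP     = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# Cast to wide format: one row per year, one GDP column per country
wide <- reshape(long, idvar = "Year", timevar = "Country", direction = "wide")

# Convert to a multivariate time series, dropping the Year column
wideTs <- ts(data = wide[, -1], start = min(wide$Year), end = max(wide$Year))
print(wideTs)
```

Each column of wideTs is now one country's series, indexed by year, which is the shape a VAR model expects.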

# see ?WDI for the indicators and more details
library(WDI)
gdp <- WDI(country = c("US", "CA", "AUS", "CHN", "IND", "PAK", "SAU"),
           indicator = c("NY.GDP.PCAP.CD", "NY.GDP.MKTP.CD"),
           start = 1960, end = 2018)
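The plotting code in this article refers to columns named Country, Year, PerCapGDP and GDP, so the downloaded columns need friendlier names. One way to do that, sketched on a small stand-in data frame so that it runs without the API; the values are made up and the incoming column names (country, year and the two indicator codes) are an assumption about what WDI() returns.

```r
# Hypothetical stand-in for the data frame returned by WDI(); values are made up
gdp <- data.frame(
  country        = c("Canada", "Canada", "India", "India"),
  year           = c(2017, 2018, 2017, 2018),
  NY.GDP.PCAP.CD = c(45000, 46000, 1900, 2000),
  NY.GDP.MKTP.CD = c(1.6e12, 1.7e12, 2.6e12, 2.7e12)
)

# Map old names to the names used by the plotting code
rename_map <- c(country = "Country", year = "Year",
                NY.GDP.PCAP.CD = "PerCapGDP", NY.GDP.MKTP.CD = "GDP")

# Rename by name, not by position, so extra or reordered columns do no harm
hits <- names(gdp) %in% names(rename_map)
names(gdp)[hits] <- unname(rename_map[names(gdp)[hits]])
print(names(gdp))
```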

The first rows of the dataset

The countries and their data

Now plot the data to see how per capita GDP has grown over the years.

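# plot per capita GDP with labels formatted in millions of dollars
# (assumes the columns are named Country, Year, PerCapGDP and GDP)
library(ggplot2)
library(scales)   # dollar
library(useful)   # multiple_format
ggplot(gdp, aes(Year, PerCapGDP, color = Country, linetype = Country)) +
  geom_line() + scale_y_continuous(labels = multiple_format(extra = dollar, multiple = "M"))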

As you know, per capita GDP depends on the population of a country. The plot shows that the series follow a clear trend, so we will fit a VAR model here.

Now let’s look at the GDP growth of these countries.

# plot the year-wise GDP of the countries mentioned above
ggplot(gdp, aes(Year, GDP, color = Country, linetype = Country)) +
  geom_line() + scale_y_continuous(labels = dollar)

As the plots show, the US has the highest GDP and per capita GDP, so plot it individually.

Next, build a model to forecast the GDP of every country mentioned above.

As you can see, Saudi Arabia has no GDP data for part of the period, so I drop Saudi Arabia and convert the data to a time series. The code for this is below.

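# cast to wide format, one column per country (assumed step, using reshape2::dcast on the GDP column)
library(reshape2)
gdpcast <- dcast(Year ~ Country, data = gdp[, c("Country", "Year", "GDP")], value.var = "GDP")
# convert to a time series, then drop Saudi Arabia since it has no GDP data for the early years
gdpTs <- ts(data = gdpcast[, -1], start = min(gdpcast$Year), end = max(gdpcast$Year))
gdpTs <- gdpTs[, colnames(gdpTs) != "Saudi Arabia"]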

This removes Saudi Arabia. Now plot the series.

GDP growth after conversion to a time series

Names of all the countries:

[1] "Australia"     "Canada"        "China"
[4] "India"         "Pakistan"      "United.States"

As discussed above, we get a set of coefficients for every country.
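The fitting step can be sketched as follows. This assumes the vars package and its VAR() function, which name lag coefficients in the Country.l1 style; the two simulated series A and B are a hypothetical stand-in for gdpTs, and the lag order p = 1 is chosen only for illustration.

```r
library(vars)  # assumption: VAR() from the vars package is used for fitting

# Hypothetical stand-in for gdpTs: two simulated series that depend on each other
set.seed(1)
n <- 200
y <- matrix(0, nrow = n, ncol = 2, dimnames = list(NULL, c("A", "B")))
for (t in 2:n) {
  y[t, "A"] <- 0.5 * y[t - 1, "A"] + 0.2 * y[t - 1, "B"] + rnorm(1)
  y[t, "B"] <- 0.1 * y[t - 1, "A"] + 0.4 * y[t - 1, "B"] + rnorm(1)
}
toyTs <- ts(y, start = 1960)

# Fit a VAR with one lag and a constant term
fit <- VAR(toyTs, p = 1, type = "const")

# One fitted equation per series
print(names(fit$varresult))

# Coefficients of the equation for series A: lags of A and B, plus the constant
print(coef(fit$varresult$A))
```

On the real data, the same two calls with the country names in place of A and B give one coefficient vector per country.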

Each model has its own coefficients; I print them separately.

Model for Australia
    Australia.l1        Canada.l1         China.l1
       -1.697297         1.537982        10.509208
        India.l1      Pakistan.l1 United.States.l1
       27.605359       -30.358330        -1.469154

Model for Canada
    Australia.l1        Canada.l1         China.l1
      -1.7700709        0.8624928        7.3055663
        India.l1      Pakistan.l1 United.States.l1
       9.0913249      -19.1109224       -0.9543049

Model for India
    Australia.l1        Canada.l1         China.l1
     -0.01773529       0.02896139       0.41307104
        India.l1      Pakistan.l1 United.States.l1
     -0.67389952       0.08271539       0.09754610
...
The full code is available at https://github.com/vatsmanish/Time-series-vector-autoregression-model

Now that you have the coefficients, you can forecast future GDP.
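To show how coefficients turn into forecasts, here is a minimal base R sketch for a VAR(1): each new value is the constant plus the coefficient matrix times the previous value, repeated step by step. The constants, the matrix A, the starting values and the helper name forecast_var1 are all hypothetical, not the fitted GDP model.

```r
# Hypothetical VAR(1) coefficients for two series (illustrative values only)
const  <- c(1, 2)
A      <- matrix(c(0.5, 0.2,
                   0.1, 0.4), nrow = 2, byrow = TRUE)
y_last <- c(10, 20)   # last observed values of the two series

# Iterate y[t+1] = const + A %*% y[t] for h steps ahead
forecast_var1 <- function(const, A, y_last, h) {
  out <- matrix(NA_real_, nrow = h, ncol = length(y_last))
  y <- y_last
  for (i in seq_len(h)) {
    y <- as.vector(const + A %*% y)
    out[i, ] <- y
  }
  out
}

fc <- forecast_var1(const, A, y_last, h = 3)
print(fc)
```

For a model fitted with the vars package, predict() with the n.ahead argument does this iteration for you and also returns confidence bounds.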
Stay tuned for updates!
Thank you
