Understanding Durbin-Watson Test

Analyttica Datalab
3 min read · Aug 4, 2021


The Durbin-Watson (DW) statistic is used to test for autocorrelation in the residuals of a regression analysis. When autocorrelation is present, the standard errors of the coefficients are typically underestimated, which may lead us to believe that predictors are significant when in reality they are not.

Autocorrelation can be positive or negative. For a stock with positive autocorrelation, a fall on the previous day makes a fall today more likely. For a stock with negative autocorrelation, a fall on the previous day makes a rise today more likely.
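
To make the distinction concrete, here is a minimal sketch that computes the sample lag-1 autocorrelation of a series. The function name `lag1_autocorrelation` and both example series are made up for illustration; they are not taken from any real stock.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    # Co-movement between each value and the value just before it
    numerator = sum(deviations[t] * deviations[t - 1] for t in range(1, n))
    denominator = sum(dev ** 2 for dev in deviations)
    return numerator / denominator


# Hypothetical series (illustrative values only)
persistent = [1, 2, 3, 4, 5, 6]        # each move tends to continue
alternating = [1, -1, 1, -1, 1, -1]    # each move tends to reverse

print(lag1_autocorrelation(persistent))   # a positive value
print(lag1_autocorrelation(alternating))  # a negative value
```
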
The Durbin-Watson test looks for a specific type of serial correlation, namely first-order autocorrelation (a lag of 1 unit). The hypotheses for the Durbin-Watson test are:

H0: First-order autocorrelation does not exist.
H1: First-order autocorrelation exists.

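The DW test statistic d is:

$$d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$$

Where,

$e_t$ are the residuals from the OLS regression,

$e_{t-1}$ are the same residuals lagged by one period, so that $e_t - e_{t-1}$ are the first-order differences of the residuals,

$T$ is the number of observations.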
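
The statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. A minimal sketch of that calculation in plain Python follows; the function name `durbin_watson` and the example residuals are assumptions chosen for illustration.

```python
def durbin_watson(residuals):
    """DW statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    n = len(residuals)
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero; the statistic is undefined")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, n))
    return numerator / denominator


# Illustrative residuals (made up): a sign flip at every step
print(durbin_watson([1, -1, 1, -1]))  # 3.0, above 2
# Illustrative residuals (made up): no change from step to step
print(durbin_watson([1, 1, 1, 1]))    # 0.0, the lower bound
```
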

The DW statistic d lies between 0 and 4.
d = 2 means no autocorrelation.
0 ≤ d < 2 means positive autocorrelation.
2 < d ≤ 4 means negative autocorrelation.
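
A short derivation shows where these ranges come from. Here $r$ denotes the sample lag-1 autocorrelation of the residuals (a symbol introduced for this sketch), and the approximation assumes a sample large enough that the two sums of squares are nearly equal:

$$d = \frac{\sum_{t=2}^{T} e_t^2 + \sum_{t=2}^{T} e_{t-1}^2 - 2\sum_{t=2}^{T} e_t e_{t-1}}{\sum_{t=1}^{T} e_t^2} \approx 2(1 - r), \qquad r = \frac{\sum_{t=2}^{T} e_t e_{t-1}}{\sum_{t=1}^{T} e_t^2}$$

Since $-1 \le r \le 1$, d lies between 0 and 4: $r = 0$ gives $d \approx 2$, $r$ close to $1$ gives d close to 0, and $r$ close to $-1$ gives d close to 4.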

A general rule followed is: DW test statistic values in the range of 1.5 to 2.5 are relatively acceptable. Values outside of this range could be a cause for worry. Values under 1 or more than 3 are a definite cause for worry.
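
The rule of thumb can be written as a small helper. This is only a sketch of the ranges quoted above; the function name `dw_rule_of_thumb` and the returned labels are made up for illustration, and the rule is not a substitute for the formal test.

```python
def dw_rule_of_thumb(d):
    """Classify a DW statistic using the 1.5-2.5 rule of thumb."""
    if not 0 <= d <= 4:
        raise ValueError("the DW statistic must lie between 0 and 4")
    if d < 1 or d > 3:
        return "definite cause for worry"
    if d < 1.5 or d > 2.5:
        return "possible cause for worry"
    return "relatively acceptable"


print(dw_rule_of_thumb(2.1))  # relatively acceptable
print(dw_rule_of_thumb(2.7))  # possible cause for worry
print(dw_rule_of_thumb(0.8))  # definite cause for worry
```
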

Application:

One of the assumptions of OLS regression is that the error terms are not correlated with each other. The DW test is used to check whether this assumption of no autocorrelation among the errors holds.
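
As a rough end-to-end illustration, the sketch below fits a one-predictor OLS line in plain Python, takes the residuals, and computes the DW statistic on them. The data values and the helper names `fit_simple_ols` and `durbin_watson` are hypothetical. In practice a library routine, for example `durbin_watson` in `statsmodels.stats.stattools`, could be used instead.

```python
def fit_simple_ols(x, y):
    """Closed-form OLS fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        / sum((xi - mean_x) ** 2 for xi in x)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    return numerator / sum(e ** 2 for e in residuals)


# Hypothetical data, ordered in time (made up for illustration)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

intercept, slope = fit_simple_ols(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
d = durbin_watson(residuals)
print(round(d, 2))  # a value near 2 suggests little first-order autocorrelation
```

The order of the observations matters: the statistic compares each residual with the one before it, so the rows should be in their natural (for example, time) order.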

Advantages:

  • It is an easy way to check for autocorrelation in the residuals of a regression analysis.

Limitations:

  • The DW test can be inconclusive: when d falls between the lower and upper critical values (dL and dU), no decision can be reached.
  • The test is not appropriate when lagged values of the dependent variable are included among the predictor variables.

Example:

In the section below, we consider a case where the ‘Estimated_Total_Retail_Sales’ figure (the target) for a retail chain of stores is to be predicted using the following predictor variables: store size (area), locality of the store, and the previous financial years’ data variables such as TotalFootprint, NetSalesPromoPeriod and NetSalesNonPromoPeriod.

The outcome of the linear regression run for ‘Estimated_Total_Retail_Sales’ (target) showed that the following predictor variables are significant: LocalityType, SAREA, FY16TotalFootprint, FY16NetSalesNonPromoPeriod, FY17NetSalesNonPromoPeriod, FY15NetSalesPromoPeriod, FY16NetSalesPromoPeriod, FY17NetSalesPromoPeriod.

The DW test is run with the above-mentioned predictors and with ‘Estimated_Total_Retail_Sales’ as the dependent variable.

Input:

Dependent Variable: Select the variable that was used as the dependent variable in the regression analysis. It has to be a continuous variable. In this case it is ‘Estimated_Total_Retail_Sales’.

Parameters:

Dependent_Variables :EstimatedTotalRetailSale

In ATH, to run the function, select the columns of the data and then use the path: Machine Learning → Regression Analysis (Linear) → Durbin-Watson Test

Output and Interpretation:

The output of the function is a table containing the autocorrelation value, the DW statistic and the p-value of the test (i.e. the probability of getting a value greater than |Autocorrelation Value|). At the 1% level of significance, H0 (autocorrelation value = 0) can be rejected if the p-value is ≤ 0.01.

In the output table, the p-value is 0.012. Since 0.012 is greater than 0.01, at the 1% level of significance we do not reject H0 (autocorrelation value = 0). This is consistent with the general rule of thumb that a DW statistic between 1.5 and 2.5 can be considered normal: the DW statistic here is 1.63.
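
The decision rule is a simple comparison of the p-value with the chosen significance level. The sketch below applies it to the reported p-value of 0.012; the function name `reject_h0` is made up, and the 5% comparison is added only to show how much the conclusion depends on the level chosen.

```python
def reject_h0(p_value, alpha):
    """Reject H0 (no first-order autocorrelation) when p-value <= alpha."""
    return p_value <= alpha


p_value = 0.012  # p-value reported in the article's output

print(reject_h0(p_value, alpha=0.01))  # False: H0 is not rejected at 1%
print(reject_h0(p_value, alpha=0.05))  # True: at 5% (an assumed level) H0 would be rejected
```
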

Read about:

Breusch-Pagan test on ATH LEAPS


Analyttica Datalab (www.analyttica.com) is a contextual Data Science (DS) & Machine Learning (ML) Platform Company.