Spurious correlations or lead indicators

Mark Monfort
Prosperity Advisers DnA
4 min read · Apr 19, 2020

I used to look at correlations in my daily job, a lot. It was typically correlations between macro-economic time series and the business performance of companies listed on a stock exchange.

Recently, we’ve seen a lot of news about unemployment figures in the US and how they have risen to levels far higher than anything seen before. The figure typically used to measure this is jobless claims, which topped 20 million people for April.

Trading Economics on Twitter: https://twitter.com/tEconomics/status/1250763468552224768

In Australia, we also have unemployment data which is available as part of the Labour force data collected by the ABS (Australian Bureau of Statistics — LINK). I went to the Downloads section where they have Excel files which you can use to see this time series. In particular I focused on this table “Table 1. Labour force status by Sex, Australia — Trend, Seasonally adjusted and Original”.

To see whether the above is a lead indicator for the rise or fall in other parts of the economy, I also grabbed Retail Trade (Sales) data, which is available from the ABS as well (LINK).

To compare growth in separate time series like these (which are measured in different units, one in persons and the other in dollars), it is best to convert both to a year-on-year growth figure. Both series are monthly, so we get monthly year-on-year growth figures (i.e. this month’s value divided by the value 12 months earlier, minus 1).
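For anyone working outside Excel, the year-on-year calculation can be sketched in a few lines of Python. This is only a minimal illustration: the function name `yoy_growth` and the sample figures are made up for the example and are not ABS data.

```python
def yoy_growth(values, periods=12):
    """Return year-on-year growth for a monthly series.

    Each result is value[t] / value[t - periods] - 1. The first `periods`
    entries have no value a year earlier, so they are returned as None.
    """
    growth = []
    for i, current in enumerate(values):
        if i < periods:
            growth.append(None)
        else:
            growth.append(current / values[i - periods] - 1)
    return growth


# Hypothetical monthly figures (13 months), purely for illustration.
unemployed = [700, 702, 705, 703, 708, 710, 712, 715, 713, 716, 718, 720, 735]

growth = yoy_growth(unemployed)
print(growth[12])  # growth of month 13 relative to month 1 (735 / 700 - 1)
```

Because the result is a ratio, a series measured in persons and a series measured in dollars end up on the same percentage scale, which is what makes them comparable.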

Doing this for the unemployment data gives us this chart

The retail trade data contains sales data on Food, Household Goods, Clothing and Personal Wear, Department Stores, Other Retailing and Cafes/Restaurants. Out of this group, only Household Goods appeared to have any sort of relationship with unemployment, and even that is not immediately apparent.

If we plot the year-on-year sales growth of Household Goods against the unemployment growth from above, the chart looks like this, and the correlation coefficient is only 21%.

However, if we shift the unemployment series back by 5 months, the correlation coefficient goes up to 54%.
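A lagged correlation like this can be sketched in Python as follows. The helper names (`pearson`, `lagged_correlation`) and the data are hypothetical, and the sketch assumes that "shifting back" means pairing the leading series at month t with the following series at month t + lag.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0.

    Assumes both series have the same length and cover the same months.
    """
    if lag == 0:
        return pearson(leader, follower)
    # Drop the last `lag` points of the leader and the first `lag` of the follower.
    return pearson(leader[:-lag], follower[lag:])


# Synthetic example: the follower repeats the leader two periods later.
leader = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12]
follower = [0, 0] + leader[:-2]

same_month = lagged_correlation(leader, follower, 0)
two_months = lagged_correlation(leader, follower, 2)
print(same_month, two_months)
```

In this synthetic case the correlation at a lag of 2 is higher than the same-month correlation, because the two-period delay was built into the data. Real series will be far noisier.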

Could this mean that rising unemployment leads to more sales for Household Goods? Maybe with a lagged effect it does, but you’d need to do a lot more work than the above to establish that.

If, like me, you’ve been getting ready to work from home and have waited in line at JB Hi-Fi or Bunnings, you’ve probably noticed that demand for these goods has gone up. It’s interesting to see that play out in the numbers as well.

What does Household Goods contain? In the explanatory notes, the ABS defines it as retail trade pertaining to the following:

  • Furniture retailing (4211)
  • Floor coverings retailing (4212)
  • Houseware retailing (4213)
  • Manchester and other textile goods retailing (4214)
  • Electrical, electronic and gas appliance retailing (4221)
  • Computer and computer peripheral retailing (4222)
  • Other electrical and electronic goods retailing (4229)
  • Hardware and building supplies retailing (4231)
  • Garden supplies retailing (4232)

If you’re interested in doing this type of lead/lag correlation, you’ll need not only the CORREL function in Excel but also OFFSET, so that you can shift one time series by a number of periods.
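The same idea, trying several shifts and comparing the resulting correlations, can be sketched in Python. This is not a translation of any particular spreadsheet: `pearson` and `scan_lags` are hypothetical helpers, the data is synthetic, and the maximum lag of 5 is an arbitrary choice for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def scan_lags(leader, follower, max_lag):
    """Return {lag: correlation of leader[t] with follower[t + lag]}."""
    results = {}
    for lag in range(max_lag + 1):
        # Trim both series so that each pair is `lag` periods apart.
        x = leader if lag == 0 else leader[:-lag]
        y = follower[lag:]
        results[lag] = pearson(x, y)
    return results


# Synthetic example: the follower repeats the leader three periods later.
leader = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12, 10, 13]
follower = [0, 0, 0] + leader[:-3]

correlations = scan_lags(leader, follower, max_lag=5)
best_lag = max(correlations, key=correlations.get)
print(best_lag, correlations[best_lag])
```

Here the scan recovers the three-period shift that was built into the synthetic data. Bear in mind that every extra lag removes data points, and trying many lags and keeping the best one makes a chance result more likely.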

This is certainly not a signal you should rely on by itself, as the results could be spurious. In fact, there is a great website showing spurious correlations (https://www.tylervigen.com/spurious-correlations). Where else can you see that pool drownings are correlated with Nicolas Cage movies?

Anyway, if you want to know more about any of the above data or how to do correlations or lead/lag them then feel free to get in touch with me here.

Mark Monfort (Head of Data Analytics and Technology)

  • Phone: 02 8262 8700
  • Email: mmonfort@prosperity.com.au


Data Analytics professional with over 10 years’ experience in various industries, including finance and consulting