Know Thy Province

Canadian Provinces & Territories by Income Tax Data

Anders Ohrn
Jan 26, 2017

Income tax: duty, political battleground and a steady source of data on society.

The Canadian government has committed to the International Open Data Charter. The charter consists of six principles on how to make data open in order to further civic engagement and economic value creation. In short, data is recognized to carry potential for good, but without the right framework to access and understand the numbers, the potential is buried.

I will use income tax data from 2000 to 2014, accessible through the government data portal, to illustrate a handful of trends. In particular, I segment the analysis by Canadian province and territory. Logical rigour shall rule supreme; I wouldn’t want to reinforce stereotypes about those other people.

How are income extremes related?

Income differences matter to society, and they are a favourite topic of ideological debates and mud fights. This time around I put aside the soap box and moral philosophy and study the data by itself.

The relation between the percentage of tax filers with a high annual income (above $100,000) and the percentage of tax filers with a low income (between $15,000 and $25,000) is shown over time and by province in the animated scatter-plot below.

Income distribution tails. Size of disk is proportional to total population.
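The two percentages plotted for each province can be sketched as follows. This is a minimal illustration, not the code behind the chart: the province names, the bracket labels other than the two named above, and all filer counts are made up for the example.

```python
# Hypothetical filer counts per income bracket (illustrative only, not real tax data).
filers_by_bracket = {
    "Province A": {"under_15k": 200, "15k_to_25k": 150, "25k_to_100k": 550, "over_100k": 100},
    "Province B": {"under_15k": 300, "15k_to_25k": 250, "25k_to_100k": 400, "over_100k": 50},
}


def bracket_share(counts, bracket):
    """Return the percentage of all filers that fall in the given bracket."""
    total = sum(counts.values())
    return 100.0 * counts[bracket] / total


# One (high-income %, low-income %) pair per province, i.e. one disk in the plot.
tails = {
    province: (bracket_share(counts, "over_100k"), bracket_share(counts, "15k_to_25k"))
    for province, counts in filers_by_bracket.items()
}

for province, (high, low) in tails.items():
    print(f"{province}: {high:.1f}% above $100,000, {low:.1f}% between $15,000 and $25,000")
```

Repeating the calculation for every year gives the trajectory of each disk.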

The two income brackets are negatively correlated across provinces: if one number is higher, the other is lower. As the years pass the disks trend mostly right and downwards, though with year-to-year fluctuations; see Ontario, Quebec and British Columbia in particular.
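One way to put a number on "negatively correlated" is the Pearson correlation coefficient across provinces for a single year. The sketch below uses invented percentages for four hypothetical provinces; it shows the calculation only and says nothing about the actual strength of the relation in the tax data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical shares (in %) for four provinces in one year; not real data.
high_share = [4.0, 6.0, 8.0, 12.0]    # filers above $100,000
low_share = [20.0, 18.0, 15.0, 11.0]  # filers between $15,000 and $25,000

r = pearson(high_share, low_share)
print(f"r = {r:.3f}")  # a value close to -1 means a strong negative relation
```

A coefficient near zero would instead mean that the two tails vary independently of each other.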

The provinces and territories are also spread out. Alberta, for example, home of the oil sands, takes a few big leaps to the right from 2006 onwards, which is hardly surprising since this was a period of high global oil prices. We have to wait a few years to see if the recent slump in the oil price will become evident in the above chart.

The data can be further segmented on geographical areas (cities and towns). For 2014 the scatter-plot looks like this:

Income tails for 2014, selected census metropolitan areas are highlighted. Colour by province.

The aggregate disks of the earlier plot are split up, and populous urban centres stand out. Intra-province relations are evident on closer inspection. One example: I expect geography buffs are perturbed by the medium-sized brown disk in the upper left. The colour implies the disk corresponds to a place in Newfoundland & Labrador, but it is not St. John’s, which is a disk between Toronto and Edmonton. Instead, this outlier disk corresponds to an aggregate of non-census defined areas; think of it as a broadly defined rural part of the province. Newfoundland & Labrador has experienced a rocky economy, with natural resources proving a less than reliable source of wealth. This may be one reason why, at least in 2014, the rural parts deviate so much from other areas of Canada, and from St. John’s in particular.

An important note when looking at the above income data: no normalization with respect to cost of living has been done. Money is a symbolic number. It is through a transaction that it becomes a product or service. For example, the tiny dot representing the Northwest Territories in the animated diagram is moving far to the lower right. However, NWT is notoriously expensive, so what quality and quantity of products and services does that money really buy? In order to resolve that question, other data has to be cross-referenced.
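No such normalization is done in this article, but the idea can be sketched under a simple assumption: that a regional price index is available from some other data set, with a reference region set to 100. The function name and the index value below are hypothetical.

```python
def adjust_for_cost_of_living(nominal_income, price_index, base_index=100.0):
    """Express an income in the purchasing power of the reference region.

    price_index is a hypothetical regional price level, where base_index
    (100 by default) is the price level of the reference region.
    """
    if price_index <= 0:
        raise ValueError("price_index must be positive")
    return nominal_income * base_index / price_index


# Hypothetical: a region where prices are 25% above the reference region.
adjusted = adjust_for_cost_of_living(100_000, 125.0)
print(adjusted)  # 80000.0
```

With an index like this, a $100,000 income in the expensive region buys what $80,000 buys in the reference region, which would move its disk back to the left in the scatter-plot.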

Age demographics by time and province.

The Age Demographics

Part of understanding income, past, current and future, is to understand the age demographics. The young and the old have less income but use the lion’s share of publicly funded services, healthcare in particular.

In the table the percentages of age cohorts per province and year are shown. From a high level, the shade of amber reveals the general ageing of Canada: the two lowest age cohorts decrease in relative terms with time, the two highest cohorts increase. The increase is steady but slow.

This trend holds across all provinces, but from different starting points. I note that the three Prairie provinces have a somewhat larger share of young people than other provinces. Is it the wheat, the grass-fed beef, the icy winters? I leave the speculations for another time.

Median Income — for Men and Women

The income data shown above concerns the tails of the distribution. The median is a metric that tracks the centre of the distribution. Again I segment the data by year and province/territory, and this time also by gender.
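To see why the median describes the centre rather than the tails, compare it with the mean on a small, made-up sample where one filer earns far more than the rest. The incomes are invented for the illustration.

```python
from statistics import mean, median

# Hypothetical annual incomes in dollars; one very high earner forms the upper tail.
incomes = [20_000, 30_000, 40_000, 50_000, 500_000]

centre = median(incomes)  # the middle value once the incomes are sorted
average = mean(incomes)   # pulled upwards by the single high earner

print(centre)   # 40000
print(average)  # 128000
```

Doubling the top income would change the mean but leave the median untouched, which is why the median complements the tail percentages discussed earlier.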

A few observations:

  • With the exception of a slight dip between 2008 and 2009, the median income has been increasing for all provinces and genders.
  • The Prairie provinces all have medians that go from below Ontario and Quebec to above as the years pass.
  • The median income for women is lower than that for men, but to different degrees across provinces; Prince Edward Island and Alberta differ in this regard. Again, exact reasons must be sought in other data sets and studies.
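The last observation can be made comparable across provinces by expressing the median income for women as a percentage of the median for men. The provinces and dollar amounts below are hypothetical and only show the arithmetic.

```python
# Hypothetical median incomes in dollars (not taken from the tax data).
medians = {
    "Province A": {"men": 40_000, "women": 30_000},
    "Province B": {"men": 50_000, "women": 30_000},
}


def women_to_men_ratio(province_medians):
    """Median income for women as a percentage of the median income for men."""
    return 100.0 * province_medians["women"] / province_medians["men"]


ratios = {province: women_to_men_ratio(m) for province, m in medians.items()}

for province, ratio in ratios.items():
    print(f"{province}: {ratio:.1f}%")
```

In this toy example the medians for women are identical in both provinces, yet the ratio differs because the medians for men differ.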

Open Data, Hurrah!

The above discussion is a brief inspection of one of many open data sets made available through the Canadian Open Data Portal. It takes slicing, dicing, visualization and thought to extract anything but the most trivial relation. The bigger picture of analysis is captured in the DIKW pyramid. Data is the foundation, and only the foundation, of information, then knowledge, then wisdom. Open data is a great starting point, but a lousy ending point. Analytical prowess must fill the gap, and that is a duty as well.

