QGIS and ABS
In this post I am going to show you how to use QGIS to visualise the change in some stats from Census 2011 to Census 2016.
This is pertinent in the context of the Greater Sydney Commission’s desire to create Three Cities within Sydney. These Three Cities will allow people to ‘realistically achieve the goal of being able to live within 30 minutes of where they work, study and play. This makes life more liveable and way more productive and sustainable for everyone.’
The change between Census 2011 and Census 2016 will allow us to see the direction Sydney has been moving over the last 5 years, and compare that with the bearing desired by the GSC.
First, let’s get the data.
Head to https://datapacks.censusdata.abs.gov.au/datapacks/. Select the ‘Postal Areas’ geography (which, unlike the SA1s, remains largely the same from census to census) and then download the NSW datapack and the Esri Shapefile geography boundary. Once you’ve done that, select the 2011 Census Datapacks from the Census year dropdown and download the NSW datapack. You don’t need the 2011 geography.

Unzip the whole lot into a folder — you’ll have to rename some of the subfolders, which have the same name for both censuses.

Now, download, install and fire up QGIS!

Getting a good-looking map is very easy in Q. From your data folder just drag and drop the POA_2016_AUST shapefile onto Q. A word on shapefiles: they are strange beasts. A shapefile is actually multiple files. You will notice you have a POA_2016_AUST with .cpg, .dbf, .prj, .shp and .shx extensions. The one to drag into Q is the file with the .shp extension, and it needs the others in the same folder to work properly. Formally the .shp, .shx and .dbf are the mandatory parts: the .dbf holds the attribute table and the .shx is an index into the .shp. The .prj is optional on paper, but without it Q has to guess the coordinate reference system, so treat it as necessary too. (The presence or absence of the .cpg has never worried me.)
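If a shapefile refuses to load, it is worth checking that all its companion files made it out of the zip. Here is a minimal sketch in plain Python (nothing QGIS-specific) that reports which parts are missing. The path in the usage line is hypothetical, so point it at wherever you unzipped the boundaries.

```python
from pathlib import Path

# Extensions that travel with a shapefile. The first three are the mandatory
# parts; .prj and .cpg are optional extras.
REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj", ".cpg")


def shapefile_parts(shp_path):
    """Return {extension: exists?} for the files that share shp_path's name."""
    base = Path(shp_path)
    return {ext: base.with_suffix(ext).exists() for ext in REQUIRED + OPTIONAL}


def missing_parts(shp_path):
    """List the extensions that are absent next to shp_path."""
    return [ext for ext, present in shapefile_parts(shp_path).items() if not present]


if __name__ == "__main__":
    # Hypothetical location, used only for illustration.
    print(missing_parts("data/POA_2016_AUST.shp"))
```

An empty list means every part is sitting next to the .shp file.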

We have geography! All the postal areas in Australia, including that one in the middle that looks like it would take a month for a postie to travel across. ABS data comes in separate files to make it less cumbersome: the geography is stored separately from the data tables, and the data table you are interested in needs to be joined to the geography table, as in SQL. To get there, let’s load some data tables (which have no geography) into Q.
From the Layer menu select ‘Add Layer’ and then ‘Add delimited text layer’.

From the menu browse to the ‘2016 Census GCP Postal Areas for AUST’ folder you downloaded and select table G02, whose full name is 2016Census_G02_AUS_POA. Table G02 is the easy table, full of interesting things the ABS has already done some processing on. It is called Selected Medians and Averages. The other tables are far more raw and difficult to deal with.

Make sure you select CSV as the file format and click ‘No geometry’, or else Q will try to find columns that represent X and Y values and make points out of them. In this case the geometry is stored in a separate file, which we have already loaded. Click OK and the aspatial layer will appear in the Layers panel.

Now that everything is loaded, let’s join the two together using what Q calls a ‘Join’. Right click on the geometry layer and click on Properties in the contextual menu. In that dialog select the Joins tab and click the green plus sign to add a join.

Joins require a common field, or column, in the two data tables to be joined. That way the machine knows which row from one table joins onto which row in the other. We are using the Postal Area geography and the common field will be postal area code — POA_CODE — or some derivative of it.
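To make the idea of a join concrete, here is a rough sketch in plain Python of what happens conceptually: each geometry row looks up the table row with the same key. It is not how Q implements joins. The field names POA_CODE16 and POA_CODE_2016, the ‘POA’ prefix on the CSV codes and the sample values are all assumptions for illustration, so check the attribute tables of your own files.

```python
import csv
import io


def normalise_poa(code):
    """Reduce a postal area code to four digits, e.g. 'POA2000' -> '2000'."""
    digits = "".join(ch for ch in str(code) if ch.isdigit())
    return digits.zfill(4)


def left_join(features, table_rows, feature_key, table_key):
    """Attach the matching table row to each feature (a left join).

    Features with no matching row keep only their own attributes.
    """
    lookup = {normalise_poa(row[table_key]): row for row in table_rows}
    joined = []
    for feature in features:
        match = lookup.get(normalise_poa(feature[feature_key]), {})
        extra = {k: v for k, v in match.items() if k != table_key}
        joined.append({**feature, **extra})
    return joined


# Made-up sample rows: field names and values are assumptions, not Census data.
features = [{"POA_CODE16": "2000"}, {"POA_CODE16": "2150"}, {"POA_CODE16": "0800"}]
csv_text = "POA_CODE_2016,Average_household_size\nPOA2000,1.9\nPOA2150,2.6\n"
table_rows = list(csv.DictReader(io.StringIO(csv_text)))

joined = left_join(features, table_rows, "POA_CODE16", "POA_CODE_2016")
```

The normalising step is the ‘some derivative of it’ part: if the two files write the same code differently, the keys have to be brought into the same shape before anything will match.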

We now have data and geometry and we have connected them. All we have to do is style the geometry display colours by the data and we will have something useful. In the properties menu for the geometry layer click on the Style menu. Instead of ‘Single Symbol’ select ‘Graduated’, which lets you style the geometry according to some field. Choose a field from the drop-down menu (I chose Average_household_size) and select ‘Natural Breaks (Jenks)’ as the mode.
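If you are wondering what natural breaks actually does, the idea is to pick class boundaries so that the values inside each class are as similar to each other as possible. Below is a small sketch of that idea in plain Python, using dynamic programming to minimise the total squared deviation within classes. It is an illustration of the principle only, not necessarily the algorithm Q runs, and the sample values in the usage note are made up.

```python
def natural_breaks(values, n_classes):
    """Split values into n_classes contiguous groups with the smallest total
    within-group squared deviation; return the upper bound of each group."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of data[i:j] in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for x in data:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def cost(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = prefix[j] - prefix[i]
        total_sq = prefix_sq[j] - prefix_sq[i]
        return total_sq - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: lowest cost of splitting data[:j] into k groups.
    best = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    best[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    split[k][j] = i

    # Walk the recorded split points backwards to recover the class bounds.
    bounds = []
    j = n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]
```

Calling natural_breaks([1, 2, 3, 10, 11, 12, 50, 51, 52], 3) returns [3, 12, 52]: the breaks fall in the gaps between the three obvious clusters rather than at equal intervals.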

Click apply and zoom to Sydney to see what you can see.

Immediately you can see that Parramatta currently has a very different demographic character to the Eastern CBD. But let’s look at what income has been doing from 2011 to 2016 across Sydney.
We have the 2016 data loaded already, so all we need to do is load the 2011 data. Follow exactly the same steps as for the 2016 CSV, but load table B02 from 2011 instead. Add it as a join to your geometry layer (one layer can have many joins).
Having done that, go back to the Style menu on the geometry layer. Notice you now have both the 2016 and the 2011 fields.

QGIS supports expressions in the Column option. To add an expression click on the little backwards ‘3’, which is an epsilon and I think stands for expression, but in Greek. This takes you to the ‘Expression dialog’. In the menu expand the ‘Fields and Values’ dropdown to see what fields you can use in an expression. Double clicking on one adds it to the expression in the right format (enclosed in double quotes). The expression syntax is very straightforward. I want to see the difference in income between 2016 and 2011 as a percentage of 2011 income. Something like:
("inc_2016" - "inc_2011") / "inc_2011"
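Note that the expression gives a fraction, so 0.1 means 10% growth; multiply by 100 if you want the legend to read in percent. Here is the same arithmetic as a small Python sketch. A postal area with no value for either year, or with a 2011 value of zero, has no meaningful growth figure, so the sketch returns None for those. The incomes used are hypothetical, not Census figures.

```python
def fractional_change(new, old):
    """Return (new - old) / old, or None when it cannot be computed."""
    if new is None or old is None or old == 0:
        return None
    return (new - old) / old


# Hypothetical weekly incomes for one postal area, not real Census values.
change = fractional_change(660, 600)
print(f"{change:.1%}")  # prints 10.0%
```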

Once you have added the expression you can change the colours and number of classes to try and tell the story. Once I fiddled around a little I got this map, which shows the percentage growth of median weekly personal income.

The darker orange pools are areas of strong income growth, and the blue is where income growth has been meagre. There is definitely a story of gentrification in the not-so-inner West, from Marrickville through Earlwood to Flemington Park; in the North around Balgowlah and Beacon Hill; and in Botany, Chifley and Little Bay in the South. My suspicion is that the already gentrified areas (Summer Hill, Manly, Coogee and Maroubra) are spreading out. It is interesting that two of these areas are not on rail lines. These are all feeder suburbs to the jobs of the (Eastern) CBD and indeed lie within the blue blob that is the GSC’s Eastern City.

Parramatta itself has seen a rapid increase in income, as have Ermington and Auburn. This might suggest the GSC has a trend on its side, and that Parramatta is creating more higher-paid jobs and becoming a strong CBD, but my understanding is that when you look at the Opal card data, Parramatta is a dormitory suburb for the CBD. This is almost certainly true of Auburn as well. The story is not entirely negative though: incomes are growing in the West.
Interestingly, students have seen no income growth at all, if Chippendale and the UNSW precinct are representative. This matches my understanding of labour market conditions for young people.
