Covid-19 data for NSW

Mark Monfort
Prosperity Advisers DnA
Apr 1, 2020

With all the coronavirus data out there, I’ve come across a number of sources and ways of getting hold of it.

My initial work used the Johns Hopkins University data made available on their GitHub repository (LINK). However, this only had cumulative counts of confirmed cases, recoveries and deaths due to Covid-19.
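If you only have cumulative counts, daily new cases can be recovered by differencing consecutive days. Here is a minimal sketch of that idea in plain Python. The function name and the sample numbers are made up for illustration and are not taken from the Johns Hopkins files, and clamping negative differences to zero is just one assumed way of handling downward revisions.

```python
from typing import List


def daily_new_cases(cumulative: List[int]) -> List[int]:
    """Convert a cumulative series into per-day new counts."""
    daily = []
    previous = 0
    for total in cumulative:
        # Assumption: a dip in the cumulative total is a data revision,
        # so the day's new count is clamped at zero rather than negative.
        daily.append(max(total - previous, 0))
        previous = total
    return daily


# Made-up illustrative numbers, not real case data.
example_cumulative = [1, 3, 3, 7, 12]
print(daily_new_cases(example_cumulative))  # [1, 2, 0, 4, 5]
```

The first day's new count is simply the first cumulative value, since there is nothing earlier to subtract.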

The only other sources I saw required PDF extraction or more detailed web scraping, so they weren’t suitable for building something quickly (remember, this work isn’t my day job, but it’s good to showcase what analytics can do).

New Data Source

Anyway, a friend of mine (Julia Lee of Burman Invest) pointed me towards the NSW Health dataset found here (LINK). This site has three sets of data, covering location, age range and source of infection.

At the time of writing, the location data only goes up to the 30th March, and the age range and source of infection data are less complete than the location data (the age range and source of infection tables show only 1,974 cases, whereas the location tables have 2,032).
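To put that gap in perspective, here is a quick sketch of the arithmetic using the two totals quoted above. The function name is hypothetical and the totals are simply the figures from this post at the time of writing.

```python
from typing import Tuple


def coverage_gap(full_total: int, partial_total: int) -> Tuple[int, float]:
    """Return the number of missing cases and their share of the fuller total."""
    missing = full_total - partial_total
    share = missing / full_total
    return missing, share


# Totals quoted in the post: location tables vs age range / source tables.
missing, share = coverage_gap(2032, 1974)
print(f"{missing} cases ({share:.1%}) are missing from the smaller tables")
```

So the age range and source of infection tables are short by 58 cases, a little under 3% of the location total.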

Still, it’s an interesting dataset, and it shows that the myth about young people being immune is just wrong.

I also found it interesting to look at where the cases were located. I have not added a map yet, as native Power BI did not like the naming conventions of the LGA/LHD codes, so I will need to do a bit more work there. One thing that stood out, though, is that Northern Sydney has the second-highest number of confirmed cases of any LHD.

The App

If you want to access the app you can do so here: LINK.

Interactivity

Each page’s charts are interactive. So, for example, I can select Hunter New England from the first bar chart on the Area page and see how its overall time series looks and which specific areas rank highest within that LHD (in this case it’s Newcastle).

On the Source page, I can do the same thing: click on a specific source (like Locally Acquired) and see the time series progression for that infection group.

For the Age group section, instead of filtering the time series chart, we use what’s called ‘brushing’ to highlight what proportion of the overall chart is taken up by our selection. In the example below I’ve held the CTRL key while clicking to highlight a number of age groups (20–24, 25–29, 30–34) and see how much of the whole time series they make up. You can see that this group of age brackets had its peak in confirmed cases on the 25th/26th (around 10 days after the infamous Tropicana party held on the 15th in Bondi and blamed for a spike in cases).
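The number that brushing highlights can be sketched as a simple share calculation: for each day, the selected age groups’ cases divided by that day’s total across all groups. This is only an illustration of the idea, not how Power BI computes it internally. The function name, day labels and counts below are all made up.

```python
from typing import Dict, Iterable


def brushed_share(
    daily_counts: Dict[str, Dict[str, int]], selected: Iterable[str]
) -> Dict[str, float]:
    """For each day, return the selected groups' share of all cases that day."""
    chosen = set(selected)
    shares = {}
    for day, by_group in daily_counts.items():
        total = sum(by_group.values())
        picked = sum(n for group, n in by_group.items() if group in chosen)
        # Days with no cases at all get a share of 0.0 to avoid dividing by zero.
        shares[day] = picked / total if total else 0.0
    return shares


# Made-up counts per age group, purely for illustration.
sample = {
    "day 1": {"20-24": 2, "25-29": 3, "30-34": 1, "60-64": 4},
    "day 2": {"20-24": 5, "25-29": 5, "30-34": 2, "60-64": 4},
}
print(brushed_share(sample, ["20-24", "25-29", "30-34"]))
# {'day 1': 0.6, 'day 2': 0.75}
```

Unlike filtering, the totals stay on the chart, which is why brushing makes it easy to judge a selection against the whole.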

If you would like to speak to me or any of the other Prosperity Advisers team about this or our other services, then please get in touch.

Contact Details

Mark Monfort (Head of Data Analytics and Technology)

  • Phone: 02 8262 8700
  • Email: mmonfort@prosperity.com.au

Mark Monfort
Prosperity Advisers DnA

Data analytics professional with over 10 years’ experience in various industries, including finance and consulting