The COVID-19 Spread in Australia

Peter_Robertson
Apr 17, 2020

A modelling study using an implementation of the SIR disease method in Excel/Visual Basic

By John Edwards, April 2020

John Edwards, a friend and colleague of some forty years, decided to apply some rigor to his understanding of how this event is likely to unfold and eventually resolve across our island nation of Australia. I am pleased that he has agreed to allow me to share this. We welcome comments and feedback and hope that this work helps make better sense of both the situation itself and the confusing, simplistic and often contradictory reporting we’ve all been subjected to.

Enjoy.

Peter Robertson

Very little information

In mid-March 2020 I was curious as to how the coronavirus epidemic was evolving. While the Australian Government was showing pictures of curves and talking about the need to ‘flatten’ them in order to reduce the impact on our health system, there didn’t appear to be much information regarding really important issues such as magnitude and duration, i.e. how many people will end up getting infected and how long will the epidemic last?

To address this situation I resolved to create and maintain my own coronavirus model, which is documented in the remainder of this presentation, including data sources, assumptions, methods and findings.

Dr Tom and the SIR disease model

My starting point was to look at the mathematics of epidemics, and this led me to the excellent and seriously enthusiastic online presentations of Dr Tom Crawford from Oxford University (tomrocksmaths.com). In particular, I came across this: Oxford Mathematician explains SIR disease model for COVID-19 (Coronavirus)

The Susceptibles-Infectives-Recovered (SIR) model described in the video is appealing because it provides a simple way to address such concerns as:

  • What will be the maximum number of people who have the disease at any one time (the ‘Infectives’)?
  • How many people will end up getting the disease (the ‘Recovered’ and the hopefully small number who die as a result of the epidemic)?
  • How long will the epidemic last?

Although there are many other ways to model epidemics, I was keen to keep my efforts as simple and practical as possible, so the model presented here is based on the SIR. While it is possible to run a number of simultaneous SIR models, one for each state, it would require the entry, modelling and checking of considerably more data than for a simple ‘lumped’ model for the entire population, as presented in this study.

From differential calculus to Excel

While Dr Tom’s explanation provides an introduction to the SIR, it doesn’t explain how to apply its differential equations to real-world data available on the internet, which is typically in the form of daily figures for the number of people infected, recovered and so on.

Looking deeper, I found the following article, which bridges theory and practice and provides all the details necessary to implement an SIR model in Excel: http://www.pandemsim.com/data/index.php/make-your-own-sir-model/

Excel frame-grab from the pandemsim.com web site

SIR — a brief description

Before going further, here is a brief description of the SIR model. Although very elaborate models have been developed to simulate the spread of diseases, the SIR is one of the oldest and most commonly applied, and it works by dividing the population at any instant into three compartments with corresponding values indicating their current proportions of the population:

  • Susceptibles — initially set to the entire population (since the virus is novel and everyone is therefore susceptible), the ‘Susceptibles’ figure represents individuals yet to be infected, and it subsequently falls as the epidemic progresses.

  • Infectives — initially set to 1, since there must be at least one individual in order for the disease to be present, and subsequently represents those infected by other ‘Infectives’. This figure tends to rise to a peak value before declining as the epidemic progresses, and is responsible for the ‘bell’ shaped curve which politicians so often describe as needing to be ‘flattened.’

  • Recovered — initially set to zero, since the disease is yet to spread, and subsequently rises as the epidemic progresses.

As the model steps through time, and depending on the rate of infection ‘β’, some ‘Susceptibles’ will transition to infective status, while some of the ‘Infectives’, in turn, will transition to ‘Recovered’ status depending on the rate of recovery ‘γ’:
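In its standard textbook form, the SIR model expresses these transitions as three coupled equations, with N the total population. This form is an assumption here, since the exact equations used in the spreadsheet are not reproduced in this text. The second set is the day-by-day version that a spreadsheet would typically step through:

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, &
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, &
\frac{dR}{dt} &= \gamma I \\[6pt]
S_{t+1} &= S_t - \beta \frac{S_t I_t}{N}, &
I_{t+1} &= I_t + \beta \frac{S_t I_t}{N} - \gamma I_t, &
R_{t+1} &= R_t + \gamma I_t
\end{aligned}
```

In both versions the three right-hand sides sum to zero, so S + I + R stays equal to N throughout the run.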

Typical SIR curves

The following chart shows typical SIR curves (plotted day by day), in this example for a hypothetical influenza epidemic spreading through the Australian population:

Initial conditions

As indicated above, the initial S, I, R values for a novel virus epidemic are obvious:

The initial figure for S, of 25,641,709 as of 29 March 2020, was obtained from the Australian Bureau of Statistics web site:

https://www.abs.gov.au/AUSSTATS/abs@.nsf/Web+Pages/Population+Clock?opendocument&ref=HPKI

In the aforementioned ‘Typical SIR curves’ example, the values for ‘β’ & ‘γ’ were set at 0.297 and 0.072 respectively, and were derived from previous data regarding influenza outbreaks, but what about the rates of infection and recovery for the coronavirus? In SIR math, the number of days to recover is represented by ‘Y0’, which is the reciprocal of ‘γ’. The value for ‘Y0’ appears to be about 17 days (according to data obtained from the ‘Data b’ website mentioned below), which is slightly longer than the widely reported figure of 14 days. Dividing 1 by 17 results in a figure of 0.059 for ‘γ’. This leaves us with the need to determine a value for ‘β’, which can be found in many ways, and in this case I used the following method.

The search for ‘β’ — source data

Estimating the rate at which the coronavirus is spreading involves the analysis of existing data, and in late March I came across two sources, which I labelled as follows:

  • Data a: https://www.health.gov.au/news/health-alerts/novel-coronavirus-2019-ncov-health-alert/coronavirus-covid-19-current-situation-and-case-numbers

  • Data b: https://www.covid19data.com.au/

I found ‘Data b’ to be more convenient and adopted it as my prime source of daily coronavirus data.

Later on I examined a third source, which was used mainly to check the derivation of additional data from ‘Data b’:

Data b — cumulative confirmed cases

As of 24 March, ‘Data b’ contained figures for the ‘Cumulative Confirmed’ cases of coronavirus infection, as depicted in the following graph:

Data b — derived values

Values for ‘Daily Confirmed’, ‘Daily Recovered’, ‘Cumulative Recovered’ and active ‘Infectives’ were derived from the ‘Cumulative Confirmed’ cases data:

The active ‘Infectives’ values are of particular interest, and have been used extensively in this document to help determine the epidemic’s progress.

Calcs for derived values

The following Excel spreadsheet fragment shows the equations and incremental methods used to derive the aforementioned values:
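The spreadsheet formulas themselves are not reproduced in this text, so the Python sketch below shows only one plausible incremental scheme, not necessarily the one used. It assumes that every confirmed case recovers exactly `days_to_recover` days after confirmation (17 days, matching ‘Y0’) and that deaths are ignored. The function name and the sample figures are hypothetical.

```python
def derive_series(cumulative_confirmed, days_to_recover=17):
    """Derive daily and active-case series from cumulative confirmed cases.

    Assumption (not from the source spreadsheet): each day's new cases
    recover exactly `days_to_recover` days later.
    """
    daily_confirmed = []
    daily_recovered = []
    cumulative_recovered = []
    infectives = []
    recovered_so_far = 0
    for day, total in enumerate(cumulative_confirmed):
        previous_total = cumulative_confirmed[day - 1] if day > 0 else 0
        daily_confirmed.append(total - previous_total)
        # Cases confirmed `days_to_recover` days ago recover today.
        if day >= days_to_recover:
            recovered_today = daily_confirmed[day - days_to_recover]
        else:
            recovered_today = 0
        recovered_so_far += recovered_today
        daily_recovered.append(recovered_today)
        cumulative_recovered.append(recovered_so_far)
        # Active cases are those confirmed but not yet recovered.
        infectives.append(total - recovered_so_far)
    return {
        "daily_confirmed": daily_confirmed,
        "daily_recovered": daily_recovered,
        "cumulative_recovered": cumulative_recovered,
        "infectives": infectives,
    }


# Hypothetical figures with a 2-day recovery period to keep the example short.
example = derive_series([1, 3, 6, 10, 15], days_to_recover=2)
print(example["infectives"])
```

Early in the series nobody has yet recovered under this assumption, so the ‘Infectives’ values equal the cumulative confirmed values.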

Data b — Infectives

During the onset of the epidemic, the ‘Infectives’ curve doesn’t seem to be much different from that of the ‘Cumulative Confirmed’ (which is understandable since not many ‘infectives’ would have recovered at that stage):

Data b — Infectives plotted on a log scale

Although the ‘Infectives’ curve appears pretty flat for the better part of the graph depicted above, plotting the curve on a logarithmic scale reveals a different story:

About that line

Clearly the above graph contains key information about the initial spread of the epidemic, and if we look closer at that apparently straight line fitting neatly through the data points, it is possible to use it to determine a value for ‘β’.

The left hand graph below shows the ‘fit’ line translated back to the origin (to simplify the math and make analysis easier) with a superimposed dashed Excel trend-line and its corresponding formula.

As the formula suggests, the line is actually an exponential curve (remember the ‘Y’ axis is logarithmic, and this makes the curve look like a straight line). So when the ‘Y’ axis is changed back to linear, the shape of the curve becomes apparent, and its similarity to the shape of the ‘Infectives’ curve is striking:
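A straight line on a logarithmic axis corresponds to a formula of the form I = a·exp(b·t). One common way to find a and b is an ordinary least-squares fit of ln(I) against t, sketched below in Python. The function name and the synthetic data are hypothetical; they are not the ‘Data b’ figures, and the trend-line formula from the chart is not reproduced here.

```python
import math


def fit_exponential(days, infectives):
    """Fit infectives = a * exp(b * day) by linear regression on ln(infectives)."""
    logs = [math.log(value) for value in infectives]
    count = len(days)
    mean_day = sum(days) / count
    mean_log = sum(logs) / count
    spread = sum((d - mean_day) ** 2 for d in days)
    covariance = sum((d - mean_day) * (y - mean_log) for d, y in zip(days, logs))
    b = covariance / spread                  # slope of the line on the log scale
    a = math.exp(mean_log - b * mean_day)    # value of the curve at day 0
    return a, b


# Synthetic, noise-free data generated from a = 1 and b = 0.2 (illustrative only).
sample_days = list(range(0, 30, 3))
sample_infectives = [math.exp(0.2 * d) for d in sample_days]
a_fit, b_fit = fit_exponential(sample_days, sample_infectives)
print(a_fit, b_fit)
```

With real case counts the points scatter around the line, so the fitted a and b only approximate the early growth.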

Rearranging the formula

The aforementioned formula tells us that the rise in the number of ‘Infectives’ between 16 February and 24 March is an exponential function of time, and this can be ‘flipped’ or rearranged to solve for time as a function of the number of ‘Infectives’:

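Having a formula for time as a function of the number of ‘Infectives’ makes it possible to determine ‘R0’, which in this study denotes the number of days it takes for the number of ‘Infectives’ to double in the SIR model. (Elsewhere ‘R0’ usually denotes the basic reproduction number, β/γ, which is a different quantity.)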

The following Excel fragment shows what happens when the time formula is entered into an Excel spreadsheet:

So it takes about 3.426 days for the number of ‘Infectives’ to double from one to two, and this also happens to be the value of ‘R0’.

We could just as easily have substituted 3 & 6, 10 & 20, 14 & 28 or any other pair of ‘Infectives’ values for 1 & 2; so long as the second figure is double the value of the first, the difference between the associated time values is always the same:

In this study, ‘β’ is taken as the reciprocal of ‘R0’. Therefore, dividing one by ‘R0’ gives 0.292, which is the value for ‘β’ in this case.
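The doubling-time argument can be checked with a few lines of Python. Because the fitted formula is not reproduced in this text, the growth rate `b` below is back-derived from the 3.426-day doubling time, and `a = 1` reflects the line being translated back to the origin; both are assumptions made for illustration, and the function name is hypothetical.

```python
import math

DOUBLING_DAYS = 3.426                  # doubling time quoted in the text
b = math.log(2) / DOUBLING_DAYS        # assumed growth rate implied by that figure
a = 1.0                                # assumed: one 'Infective' at time zero


def days_to_reach(infectives):
    """Time formula: the exponential I = a * exp(b * t) solved for t."""
    return math.log(infectives / a) / b


# Any pair where the second value is double the first gives the same interval.
pairs = [(1, 2), (3, 6), (10, 20), (14, 28)]
intervals = [days_to_reach(high) - days_to_reach(low) for low, high in pairs]
beta = 1 / DOUBLING_DAYS               # the study's rule: beta is the reciprocal

print([round(value, 3) for value in intervals], round(beta, 3))
```

The intervals are identical because ln(2x) − ln(x) = ln 2 for any x, which is why the choice of pair does not matter.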

SIR between 16 February & 24 March

The following graph compares the output of an SIR model, which uses the 17 February starting date and ‘β’ value determined above, along with the known ‘Infectives’ data for the same period. As indicated, the two curves are a close match:

SIR on a log scale

Switching once again to a logarithmic scale, the chart below shows that the purple ‘Modelled’ ‘Infectives’ line (which is really a curve) closely resembles the ‘fit line’ previously used to determine the X-intercept start date and ‘β’ value, and this agrees with the aforementioned observation that, between 17 February and 24 March, the spread of the coronavirus in Australia was increasing exponentially:

If no action had been taken

The curves below result from running the SIR model into the future with the aforementioned settings, say for about a year. This indicates that if no action had been taken to slow the epidemic, the Australian health system would have been utterly overwhelmed by mid-May, when the number of ‘Infectives’ would have approached 11 million people, and by the beginning of August the epidemic would have impacted nearly all the population. Chances are that many of the individuals represented by the grey ‘Recovered’ curve would not have survived such an onslaught. Let’s call these the ‘worst case’ SIR curves.
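A run of this kind can be sketched in Python using the population, ‘β’, ‘Y0’ and start date quoted in this study. The daily update rule is the standard discrete SIR step, which is an assumption here since the spreadsheet's own formulas are not shown, so the exact peak may differ somewhat from the chart. The function and variable names are hypothetical.

```python
from datetime import date, timedelta

POPULATION = 25_641_709        # ABS figure quoted in the text
BETA = 0.292                   # rate of infection derived in the text
GAMMA = 1 / 17                 # rate of recovery (about 0.059)
START = date(2020, 2, 17)      # model start date used in the text


def run_sir(population, beta, gamma, days):
    """Return one (S, I, R) tuple per day using simple daily steps."""
    # Assumption: one initial 'Infective' is taken out of the 'Susceptibles'
    # so that S + I + R always equals the population.
    s, i, r = population - 1.0, 1.0, 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


history = run_sir(POPULATION, BETA, GAMMA, days=365)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
peak_infectives = history[peak_day][1]
final_s, final_i, final_r = history[-1]

print("Peak 'Infectives':", round(peak_infectives))
print("Peak date:", START + timedelta(days=peak_day))
print("'Recovered' after a year:", round(final_r))
```

Keeping ‘β’ fixed for the whole year is what makes this a ‘worst case’: nothing in the run slows the spread except the shrinking pool of ‘Susceptibles’.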

A change in the curve

Fortunately, after 25 March the slope of the ‘Infectives’ curve started to reduce, as indicated by the black dotted line in the graph below. It was no longer following the predicted ‘worst case’ benchmark path. On 19 March the Australian national border was closed to non-residents and non-citizens, and on 20 March the government announced that gatherings of fewer than 100 people would be allowed on the proviso that “there would be 4 square metres provided per person in an enclosed space”. Outdoor gatherings of more than 500 people were also banned indefinitely, and the distancing rules further required the closure of cinemas, theatres, restaurants, cafes, pubs and clubs, along with restrictions on the numbers attending weddings and funerals. The lock-down further increased as the states closed their borders in late March.

The delay & borders

The above graph indicates a delay of about six days between the instigation of measures to limit the coronavirus spread and the change in slope of the ‘Infectives’ curve, and this may be due to the incubation period, which, according to the World Health Organisation, is typically five days.

Given that social distancing measures had already been in place prior to 19 March, perhaps Australia’s single most important response to the epidemic was to close the national, and thereafter the state, borders. It would appear our unique island nation, with its large area, simple arrangement of state borders and small population, is our best defence against the epidemic, and everything else we do can only enhance the outcome.

4/5 April — a turning point

Between 19 March and 4/5 April, a period of about 16 days (which is very close to ‘Y0’, the number of days to recover), the slope of the ‘Infectives’ curve continued to reduce, to the point where it reached zero some time between 4 & 5 April, and by then the number of ‘Recovered’ individuals was starting to rise rapidly. This situation is graphically illustrated in the following chart:

Continuing descent

Since the turning point between 4 & 5 April, when the number of ‘Infectives’ peaked around 4,800, the corresponding curve has consistently descended, and was around 3,500 by 11 April. Conversely, the ‘Recovered’ curve continued to rise, and was around 2,800 by 11 April.

More about β

Since the slope of the ‘Infectives’ curve was consistently reducing from 25 March onwards, this implied that the value of ‘β’ was also reducing (or decaying, perhaps), and this is graphically illustrated in the chart below.

As previously explained in “ ‘R0’ & ‘β’ ”, the value of ‘β’ prior to 25 March was determined by analysing the ‘Infectives’ curve between 17 February and 24 March. From 25 March onwards, a Monte Carlo algorithm (which I coded in Visual Basic) was used to automatically converge on new daily ‘β’ values (see Wikipedia’s ‘Monte Carlo Algorithm’ entry for more on this).
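The Visual Basic routine is not shown in this text, so the Python sketch below is only one way such a Monte Carlo search could work, not the routine actually used. It draws random candidate ‘β’ values, keeps the one whose single SIR step best reproduces the next day's observed ‘Infectives’, then narrows the sampling window around that value. All names, the window-narrowing rule and the sample figures are hypothetical.

```python
import random


def sir_step(s, i, r, beta, gamma, n):
    """Advance the SIR compartments by one day."""
    new_infections = beta * s * i / n
    new_recoveries = gamma * i
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries


def monte_carlo_beta(s, i, r, observed_next_i, gamma, n,
                     trials=2000, rounds=6, seed=0):
    """Randomly search for the beta that best matches tomorrow's 'Infectives'."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    low, high = 0.0, 1.0               # assumed initial search range for beta
    best_beta, best_error = None, float("inf")
    for _ in range(rounds):
        for _ in range(trials):
            candidate = rng.uniform(low, high)
            _, next_i, _ = sir_step(s, i, r, candidate, gamma, n)
            error = abs(next_i - observed_next_i)
            if error < best_error:
                best_beta, best_error = candidate, error
        # Narrow the window around the best value found so far.
        half_width = (high - low) / 4
        low = max(0.0, best_beta - half_width)
        high = best_beta + half_width
    return best_beta


# Hypothetical state for one day; the 'observation' is generated with beta = 0.15.
s0, i0, r0 = 25_000_000.0, 4_000.0, 2_000.0
n0 = s0 + i0 + r0
gamma0 = 1 / 17
_, target_i, _ = sir_step(s0, i0, r0, 0.15, gamma0, n0)
estimated_beta = monte_carlo_beta(s0, i0, r0, target_i, gamma0, n0)
print(round(estimated_beta, 4))
```

For a single day this could also be solved directly, since the step is linear in ‘β’; a random search becomes more useful when matching several days at once.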

Projecting β

From about 28 March onwards there were sufficient reducing ‘β’ values for Excel to determine a corresponding curve of best fit. The resulting logarithmic approximation of ‘β’ made it possible to predict future ‘β’ values. To demonstrate this, the following chart plots ‘β’ values as of 1 April, together with the resulting fitted and projected logarithmic curve:
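A logarithmic curve of best fit has the form β(t) = a·ln(t) + b, and can be found by least squares on ln(t), as in the Python sketch below. The daily ‘β’ values are synthetic and the day numbering is hypothetical; the study's actual values appear only in the chart. Clamping the projection at zero is an added assumption, since a negative rate of infection has no meaning.

```python
import math


def fit_logarithmic(days, betas):
    """Fit beta = a * ln(day) + b by least squares; days must be positive."""
    xs = [math.log(day) for day in days]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_beta = sum(betas) / count
    spread = sum((x - mean_x) ** 2 for x in xs)
    covariance = sum((x - mean_x) * (y - mean_beta) for x, y in zip(xs, betas))
    a = covariance / spread
    b = mean_beta - a * mean_x
    return a, b


def project_beta(a, b, day):
    """Projected beta for a future day, never allowed to fall below zero."""
    return max(0.0, a * math.log(day) + b)


# Synthetic daily beta values generated from a = -0.05 and b = 0.29.
sample_days = [1, 2, 3, 4, 5]
sample_betas = [-0.05 * math.log(day) + 0.29 for day in sample_days]
a_fit, b_fit = fit_logarithmic(sample_days, sample_betas)
print(a_fit, b_fit, project_beta(a_fit, b_fit, 10))
```

A negative `a` gives a curve that falls quickly at first and then flattens, which is the shape needed for a decaying ‘β’.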

Predicting — as of 1 April

Having an approximation of ‘β’ extending into the future made it possible to run an SIR model on a daily basis, and this in turn provided predictions of the date where the ‘Infectives’ figure would peak, along with the date when the ‘Recovered’ figure stops increasing (i.e. when the epidemic has apparently ended). The following chart shows such a prediction based on the ‘Infective’ figures and projected ‘β’ as of 1 April:

‘β’ as of 11 April

By 11 April the ‘β’ value was pretty much at zero:

Predicting — as of 11 April

The following chart shows predicted ‘Infectives’ and ‘Recovered’ curves based on the derived ‘Infective’ figures and projected ‘β’ as of 11 April:

Day by day ‘Infectives’

The composite chart below shows ‘Infectives’ curves projected from 30 March (outermost) through 11 April (innermost). While the curve for 11 April has a lower peak value and trends toward zero more rapidly than any other, the projections have changed little overall from one day to the next, and it is therefore reasonable to conclude that the epidemic has receded very significantly.

Day by day ‘Recovered’

The composite chart for ‘Recovered’ also graphically confirms that the epidemic is receding:

Predicting total ‘Recovered’

A graph of the predicted total ‘Recovered’ individuals indicates the figure will settle at around 8,300 some time in July/August:

Deaths

Although this study does not show figures for those who pass away during the epidemic, deaths can be inferred as a proportion of those who recover:

Summary

Here is a summary of key findings in this study:

  • The total number of active ‘Infectives’ peaked at around 4,800 on 4/5 April and has declined significantly since then.
  • The total number of ‘Recovered’ should plateau around 8,300 individuals some time in July/August.
  • Around 150 people may die due to the epidemic, which represents about 0.0006% of the Australian population.

So long as we remain vigilant over the next six months at least, it appears the Australian population will experience significantly less impact than many less fortunate nations around the globe.
