Is R0 Actually Meaningful for COVID-19?



Mathematical models and their predictions are used in many fields today, and the same approach is useful for studying the nature and growth of pandemics. Typical use cases include studying exponential growth, capturing the behavioural patterns of a society, and quantifying how strongly a disease spreads. The ‘Basic Reproduction Number (R0)’ is one such epidemiological parameter; it can be used to model, describe, and predict the state of an epidemic. Let’s take a brief look at what R0 represents and how it relates to the current Covid-19 crisis.

R0 is the average number of secondary infections generated by a single infectious individual in a population of completely susceptible individuals. Put less technically, R0 is the number of new cases that one infected person produces, on average, over the period in which he/she remains infectious. The definition applies to a population that has not been exposed to the virus before and has not been vaccinated against that particular illness (no pre-existing immunity). Loosely, if cases are counted per generation of infection, R0 can be read as the ratio between ‘the new cases reported in a specific period’ and ‘the cases reported in the previous period’. An R0 value of less than one therefore indicates a good situation: the new cases are only a fraction of the earlier ones, so the situation is improving and the outbreak will end on its own. If R0 stays constant at one, the virus will neither grow out of control nor disappear from society; the number of new cases in each period equals the number in the previous one. Finally, if R0 is greater than one, the virus is in an exponential growth stage and precautions must be taken to protect the susceptible population. Let’s compare the recorded R0 values of some of the most common infections of the past with that of the novel coronavirus.

Figure 1: R0 values for different infections (Image was created by the author)

Different Approaches Used to Model an Epidemic

An epidemic can be modelled in several ways, including statistical models, Susceptible-Infectious-Recovered (SIR) models, and agent-based models. Statistical models predict the outcome by applying statistical techniques to the data observed in a given time period. They do not usually contain a parameter similar to R0. They are useful for short-term predictions, but they largely fail to capture dynamics such as changing contact rates and disease transmission. Agent-based models use census data, public transportation data, surveys, and mobile phone data to simulate individuals as agents in a larger network. Different groups use different agent-based models: some compute an R0 value for the model, while others only use rough estimates to explore the impact of R0 under different social conditions. In both cases, R0 is defined per agent rather than for the whole population.

SIR models divide the population into three categories: ‘susceptible’, ‘infectious’, and ‘recovered’. Depending on the situation, they may also include categories such as ‘exposed but not yet infectious’, ‘asymptomatic’, or ‘dead’. The SIR model is built on three main differential equations (a differential equation describes the rate of change of one quantity with respect to another), which describe the rates of change of the ‘susceptible’, ‘infectious’, and ‘recovered’ populations. From these three equations we can calculate the R0 value of the model. Its parameters are the probability of infection, the contact rate, and the period over which an individual is infectious. Here the computed R0 represents the whole population, unlike the per-agent value in an agent-based model.
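As a minimal sketch of the three SIR equations described above, the Python snippet below integrates them with a simple forward-Euler step. The symbols `beta` (contact rate multiplied by the probability of infection per contact) and `gamma` (one over the infectious period), the function name `simulate_sir`, and all numeric values are assumptions chosen for illustration, not values from any study. In this textbook form of the model, R0 works out to `beta / gamma`.

```python
def simulate_sir(beta, gamma, population, initial_infected, days, dt=0.1):
    """Integrate the basic SIR equations with forward-Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Hypothetical parameters, for illustration only.
beta = 0.5   # transmission rate per day (contact rate x infection probability)
gamma = 0.2  # recovery rate per day (1 / infectious period of 5 days)
r0 = beta / gamma

history = simulate_sir(beta, gamma, population=10_000, initial_infected=10, days=120)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"R0 = {r0:.2f}")
print(f"Peak of about {peak_infected:.0f} infectious people around day {peak_time:.0f}")
```

Rerunning the sketch with `beta` below `gamma` (so that R0 is less than one) should show the infectious count shrinking from the start instead of peaking later.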

Epidemiologists, on the other hand, calculate R0 by tracing individual-level contact data in the midst of the pandemic, just as the formal definition of the reproduction number describes. Once an individual is found to be infected, his/her possible contacts are immediately traced and tested. R0 is then computed by averaging the number of secondary cases caused by many diagnosed primary cases. For reasons described later, this calculated value is not very reliable as an epidemic threshold parameter. It represents only an average, and can be thought of as a discrete version of the quantity we are after. R0 is also often mistaken for a rate with dimensions of inverse time (T⁻¹), when it is in fact a dimensionless number.
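The averaging step described above can be sketched in a few lines of Python. The secondary-case counts are made-up numbers for illustration, and `estimate_r_from_tracing` is a hypothetical helper name. The standard error is included to hint at why such an estimate is shaky when only a handful of primary cases have been traced.

```python
import math
import statistics


def estimate_r_from_tracing(secondary_cases):
    """Return (mean, standard error) of secondary cases per traced primary case."""
    mean = statistics.mean(secondary_cases)
    std_error = statistics.stdev(secondary_cases) / math.sqrt(len(secondary_cases))
    return mean, std_error


# Made-up contact-tracing results: secondary cases caused by each primary case.
traced = [0, 1, 3, 2, 0, 4, 1, 0, 2, 7]
mean, std_error = estimate_r_from_tracing(traced)
print(f"Estimated reproduction number: {mean:.2f} +/- {std_error:.2f} (dimensionless)")
```

Because the result is a count of cases per case, it carries no time unit.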

Factors Affecting R0

The R0 value depends on three main factors:

  1. Infectious Period — Duration of infectiousness of a patient
  2. Shedding Potential/Mode of Transmission — Probability of infection being transmitted related to the mode of transmission
  3. Contact Rate — Average rate of contact between infected and susceptible individuals.

R0 can also be expressed as the product of the numerical values assigned to each of these three factors, which again gives an average value:

(R0 = Average rate of contacts × probability of infection × duration of infectiousness).
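The product above can be checked with a few lines of Python. The function name `basic_reproduction_number` and the input values are hypothetical and only illustrate the arithmetic. Note that the time units of the contact rate and the infectious period cancel, leaving a dimensionless number.

```python
def basic_reproduction_number(contact_rate, infection_probability, infectious_days):
    """R0 = contact rate x probability of infection x duration of infectiousness."""
    if not 0.0 <= infection_probability <= 1.0:
        raise ValueError("infection_probability must lie between 0 and 1")
    return contact_rate * infection_probability * infectious_days


# Hypothetical inputs, not measured values.
baseline = basic_reproduction_number(
    contact_rate=10,             # contacts per day
    infection_probability=0.05,  # chance of transmission per contact
    infectious_days=5,           # days a patient stays infectious
)
# Halving the contact rate (for example through distancing) halves the result.
distanced = basic_reproduction_number(5, 0.05, 5)
print(baseline, distanced)
```

Since the three factors multiply, reducing any one of them by a given proportion reduces R0 by the same proportion.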

Society is more vulnerable when any of these factors takes a relatively high value. The R0 value therefore offers some guidance on how isolation protocols should be deployed and when a certain area can be reopened under normal conditions. It can also give a rough idea of how long close contacts should be tested and monitored before they are allowed back into public life. As for the shedding potential, it is relatively high for the coronavirus because airborne diseases tend to spread rapidly. By comparison, Hepatitis B/C and HIV have a relatively low shedding potential because they require contact with bodily fluids for infection.

Validity and Usefulness of R0

Now that we have an idea of the reproduction number R0, let’s look at how well this parameter actually captures the nature of a pandemic. What we must realize is that R0 is extremely difficult to measure accurately until the pandemic has ended. The values in Figure 1 could be calculated to greater accuracy only after each outbreak was over, because while a pandemic is ongoing new data keeps arriving and the estimates keep changing for many reasons. As an example, 11 different studies (Figure 2) reported very different R0 values in the early days of the coronavirus outbreak [1] [6]. Among the reasons for this spread are the lack of important data in the early stages and the reuse of models built for other pandemics on the assumption that the diseases behave similarly.

Figure 2: R0 mean and range estimates from 11 different studies of COVID–19 as a function of time ([1])

As mentioned above, R0 is difficult to estimate accurately because it fluctuates strongly with changes in the three factors listed earlier. These factors are highly sensitive to local conditions, so an R0 value calculated in one country might not apply to another: neighbourhoods, culture, and the nature of social conduct differ drastically from place to place. For the same reason we must be extra careful when selecting a model, because decisions based on an erroneous model could make the situation worse.

As we know, R0 is calculated for a completely susceptible population, meaning one that is not protected by any form of health measure. Real life is different: as soon as symptoms are identified in even a few citizens, rules and regulations are imposed to protect the rest of society, which makes the whole population less susceptible. Epidemiologists therefore typically use two ‘R’ values: R0 and the ‘Effective reproduction number (Re)’, which measures reproduction as the virus becomes more common and public health measures are put in place. As you might expect, the Re value is usually much lower than R0, and in the long run it gives better insight into the whole scenario than the basic reproduction number.

Re is interpreted in the same way as R0: a value above one indicates a dangerous situation, while a value below one suggests that society is recovering and is less prone to another phase of exponential growth. These values tell us where we stand on the growth curve. When Re falls to one, we say that we have reached the ‘inflection point’, after which the outbreak moves towards its end with lower Re values along the way. If Re stays stable at a value below one for a few days, we can assume we are approaching the stage where the virus no longer spreads to new populations. Germany offers an example of policymakers using Re: the country started reopening town by town when Re reached a value of 0.7 [2]. Even though Re offers a better perspective, it is still not a very good estimate, because the data on reported cases and deaths may be incomplete.
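To make the threshold at one concrete, here is a rough sketch that projects case counts generation by generation under a constant reproduction number. It is a deliberately crude model (each case causes exactly R new cases in the next generation, with no depletion of susceptible people), and the function name `project_cases` and the starting value of 100 cases are assumptions for illustration. The value 0.7 echoes the German figure quoted above.

```python
def project_cases(initial_cases, reproduction_number, generations):
    """New cases per generation if every case causes `reproduction_number` new ones."""
    cases = [float(initial_cases)]
    for _ in range(generations):
        cases.append(cases[-1] * reproduction_number)
    return cases


for r in (0.7, 1.0, 1.5):
    trajectory = project_cases(initial_cases=100, reproduction_number=r, generations=5)
    print(f"R = {r}: " + ", ".join(f"{c:.0f}" for c in trajectory))
```

Under these assumptions the counts shrink for R below one, stay flat at exactly one, and grow geometrically above one.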

The spread of such a disease also depends on factors such as the strength and transmissibility of the virus, the age of the people exposed, how well they have been vaccinated, any non-communicable diseases they have been diagnosed with (which strongly affect their immunity), and their overall lifestyle over the years. Factors of this kind determine the probability that an exposure becomes an actual infection, while behavioural patterns and other habits determine the probability of being exposed to an infectious person in the first place.

Furthermore, estimates of R0 depend heavily on many other factors: the method of identification used by the responsible authority, the number of random tests and the total number of tests carried out in a day, the validity of the test results, the geographical locations of the identified patients, how well patients are isolated, the safety measures taken by the general public, the effectiveness of public awareness campaigns, and society’s attitude towards the crisis. As an example of the geographical factor, if a group of people carrying the virus already lives in an isolated area, then even if they have not been identified as sick they will contribute to a rise in R0 only for a few days, not in the long term.

Super Spreaders and Silent Spreaders

Super spreaders and silent spreaders are two further scenarios that can push the R0 value away from the norm. With pathogens of this type, certain infected individuals show no symptoms at all. They go unnoticed and are often called carriers or asymptomatic carriers. Because they are not identified and isolated, one such person can infect many more people than a symptomatic carrier. For Covid-19 in particular, studies have shown that there could be a considerable number of asymptomatic carriers compared with earlier outbreaks such as MERS or SARS [3].

Estimates of R0 that are calculated without the asymptomatic cases will therefore be systematically biased. Now let’s look at the ‘Serial Interval (SI)’, another parameter that helps us understand such a situation. In epidemiology, the serial interval is the time between symptom onset in one case and symptom onset in the subsequent case that he/she causes; the closely related generation interval is the time between the two infections themselves. Looking more closely, if asymptomatic cases have a shorter generation interval than symptomatic cases, R0 will be over-estimated, and if they have a longer generation interval, R0 will be under-estimated [4].

However, the same study [4] shows that if asymptomatic cases tend to resolve quickly, they may be driving a larger fraction of secondary cases than we would expect without accounting for these differences. Interestingly, some individuals could be only partially asymptomatic, and the resulting R0 would still not differ much from the fully asymptomatic case. A study conducted in Italy found that the share of asymptomatic cases could be as high as 43% [5], which further affects the validity of the reproduction number.

Individuals who have the potential to infect a much larger group of people than a typical infected person are called super-spreaders. They tend to have a larger impact on the overall growth or decay of the pandemic, so identifying them in time is essential for controlling it. The impact of a super-spreader depends on his/her habits, social life, the population of his/her living area, and so on. Scientists also hypothesize that these individuals might have some phenotype that causes them to shed more virus than others. Some scientists, however, prefer to speak of super-spreading events rather than super-spreading individuals, as this fits the context better. An example is the kickboxing match of March 6, 2020, which resulted in a new spike of coronavirus cases in Thailand. Another interesting concept closely related to super-spreading is the ‘20/80’ rule: 20% of infected individuals are responsible for 80% of the transmissions, while the remaining 80% of infected people cause only 20% of the total infections. In such situations we therefore define an individual reproduction number for better representation, and super-spreading events usually have an exceptionally high individual R0.
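The ‘20/80’ rule can be checked against contact-tracing data by sorting cases by the number of people each one infected and asking what share of all transmissions comes from the top 20%. The sketch below does this for a made-up list of individual secondary-case counts. `share_from_top` is a hypothetical helper, and the numbers are chosen only to show how a mean can hide a skewed distribution.

```python
def share_from_top(secondary_cases, top_fraction=0.2):
    """Fraction of all transmissions caused by the most infectious `top_fraction` of cases."""
    total = sum(secondary_cases)
    if total == 0:
        return 0.0
    ranked = sorted(secondary_cases, reverse=True)
    top_count = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_count]) / total


# Made-up individual reproduction numbers for 20 traced cases.
individual_r = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 4, 6, 9, 13]
mean_r = sum(individual_r) / len(individual_r)
top_share = share_from_top(individual_r)
print(f"Mean R = {mean_r:.1f}, top 20% of cases cause {top_share:.0%} of transmissions")
```

With these made-up counts the mean is 2.0, yet four of the twenty cases account for 80% of the transmissions and eight cases infect nobody at all.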

Silent spreaders and super spreaders can therefore change estimates of R0 or Re drastically, and these variations make it impossible to project the overall spread of a disease from R0 alone.


Even if we select the best approaches and feed the models with the most accurate data, we cannot simulate and forecast the behaviour of a pandemic with very high accuracy. Given the many factors behind every detail used for the predictions, we can agree that a model gives only a rough idea of how well we are doing and what will happen in the immediate future. Relying on estimates of Re to decide when to lift lockdowns and other social distancing measures can therefore be risky, depending on how well the modelling was done and how accurate the data was.

Furthermore, we cannot know the true value of R0 until the outbreak is completely over. Once we account for the sensitivity of the data and the possibility of super and silent spreaders, the estimated R0 could be far from reality. To address these issues, new models and metrics have been proposed. Some scientists suggest that, rather than using a mean value alone to describe the spread of a disease, we should include data on the variation of R0 and Re across the chosen population, obtained from detailed contact tracing studies [7]. This matters even more in the presence of super-spreading, because R0 is an average that is highly sensitive to outliers of this kind and cannot fully capture the dynamics of an epidemic.

In the end, we can conclude that no model is perfect, and no single model will always be suitable for this type of prediction. Ultimately it is a matter of finding the right data, striking the right balance in the model, and handling each part of the model correctly.


  1. Majumder, Maimuna S. et al. Early in the epidemic: impact of preprints on global discourse about COVID-19 transmissibility. The Lancet Global Health, Volume 8, Issue 5, e627–e630.
  2. Nowcasting and R estimate: Estimation of the current development of the SARS-CoV-2 epidemic in Germany.
  3. He, X., Lau, E.H.Y., Wu, P. et al. Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat Med 26, 672–675 (2020).
  4. Sang Woo Park, Daniel M. Cornforth, Jonathan Dushoff, Joshua S. Weitz. The time scale of asymptomatic transmission affects estimates of epidemic potential in the COVID-19 outbreak. Epidemics, Volume 31, 2020, 100392, ISSN 1755–4365.
  5. Lavezzo, E., Franchin, E., Ciavarella, C. et al. Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo’. Nature 584, 425–429 (2020).
  6. Sanche, S., Lin, Y., Xu, C., Romero-Severson, E., Hengartner, N., and Ke, R. (2020). High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerging Infectious Diseases, 26(7), 1470–1477.
  7. Laurent Hébert-Dufresne, Benjamin M. Althouse, Samuel V. Scarpino, Antoine Allard. Beyond R0: Heterogeneity in secondary infections and probabilistic epidemic forecasting. Pre-print, medRxiv 2020.02.10.20021725; doi:



Isuru Pamuditha
University of Peradeniya : COVID Research Group

Ponder & Wander... That'll make you an interesting person || Engineering Undergraduate ||