SEIRD model of COVID-19

Mar 15, 2020

This article uses two separate models. First, I use a simple SEIR model to attempt to model COVID-19. It is not well-calibrated and ends up over-predicting the basic reproduction number, but it is useful for some general insight. I call it an SEIRD model because I add deaths to recoveries, but that’s typically done anyway. Then I implement Alison Hill’s model, which is more sophisticated and potentially more realistic.

SEIRD Model

A relatively simple model for epidemic growth is the SEIR model. This model uses first-order differential equations with time-independent coefficients to model how patients progress through four stages of a disease, and I added a fifth possible stage:

susceptible (S): not yet infected
exposed (E): infected but not yet infectious
infected (I): infected, and able to infect others
recovered (R): no longer sick, and immune
dead (D): killed by the disease

The model assumes for each infected person, there is a chance that each susceptible person will become exposed. For each exposed person, there is a chance they become infected. There’s a chance each infected person becomes either recovered or dead. The fractional chance they die is the case mortality rate m.

So here are the extremely simple differential equations, where S, E, I, R, and D are fractions of the total population:

∂S/∂t = –β S I
∂E/∂t = β S I – σ E
∂I/∂t = σ E – γ I
∂R/∂t = γ I (1 – m)
∂D/∂t = γ I m

So I just need to assign some numbers.

COVID-19

For m, I pick m = 0.01, which is in the ballpark. South Korea, with extensive testing, reports 0.7%, although the number is higher when there is a crisis in the health care system, as in the Lombardia region of Italy and Wuhan, China. So 1% seems reasonable.

σ is the inverse incubation time. Mean incubation is reported at between 5 and 6 days, so I set σ = 0.2/day. However, for the actual disease, individuals are reported to be infectious 2–3 days before symptoms, so this model is over-simplified.

I have revised this parameter since I first posted this model. γ sets the time from symptoms to resolution (recovery or death). The same reference reports this is from 2 to 6 weeks; if I pick an average of four weeks, that makes γ = 1/(28 days). If patients are isolated in the hospital, or self-quarantine, the infectious time could be reduced. However, for this simple model I’m assuming that patients remain infectious throughout the duration of their illness. This is an over-simplification: it is reported people are far less infectious 10 days after symptoms begin.

That leaves β, which is more complicated. I tuned β = 0.32 / day to give an exponential rate of increase during exponential growth of +15% per day, which is a doubling time of approximately 5 days. This is consistent with what’s observed outside of China.
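As a rough check on this tuning, here is a minimal Python sketch (not the code I used for the plots) that computes the early growth rate implied by β, σ, and γ. It assumes S is still close to 1, so the E and I equations become linear and the growth rate r satisfies (r + σ)(r + γ) = β σ. The function name growth_rate is just for illustration.

```python
import math

def growth_rate(beta, sigma, gamma):
    """Early exponential growth rate (per day) of the SEIR equations with S close to 1.

    Linearising dE/dt = beta*I - sigma*E and dI/dt = sigma*E - gamma*I gives
    (r + sigma) * (r + gamma) = beta * sigma; this returns the larger root r.
    """
    disc = (sigma - gamma) ** 2 + 4.0 * beta * sigma
    return 0.5 * (-(sigma + gamma) + math.sqrt(disc))

beta, sigma, gamma = 0.32, 0.2, 1.0 / 28.0   # values from the text
r = growth_rate(beta, sigma, gamma)
daily_increase = math.exp(r) - 1.0            # fractional growth per day
doubling_time = math.log(2.0) / r             # days
print(f"r = {r:.3f}/day, +{100 * daily_increase:.0f}%/day, "
      f"doubling in {doubling_time:.1f} days")
```

With the parameters above this lands in the neighborhood of the +15% per day and 5-day doubling quoted in the text.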

So that’s it. If you believe the model, there are good numbers for each parameter based on the reference data I was able to find.

Basic reproduction ratio

The basic reproduction ratio is the average number of people each infected person goes on to infect when nearly everyone else is still susceptible. In this model an infected person exposes others at a rate β (with S close to 1) for an average time of 1/γ. It is then simple to calculate that the basic reproduction ratio is

R₀ = β / γ

With my parameters, R₀ = 9.0. This is higher than most references, which estimate that it is between 2 and 3, so my predictions about how much the exposure rate needs to be reduced are probably extreme. Exponential growth becomes exponential decay when the reproduction ratio drops below 1.

Running the Simulation

I coded the model in Perl, using an integration scheme where for each time step, I estimated the change for each variable, applied half of that change to the variable as an average value for the time interval, then recalculated the changes. This is better than a simple forward integration scheme.

For the initial condition I assume 1 out of 7 billion people is exposed at time 0. The rest are susceptible. Nobody as yet is either infected, recovered, or dead.
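For readers who want to experiment, here is a minimal Python sketch of the same idea. It is not my Perl code: it is my description of the scheme written as a standard midpoint (second-order) step, and the time step of 0.1 days and the 400-day run length are assumptions chosen for illustration. All populations are fractions of the total.

```python
def derivs(state, beta, sigma, gamma, m):
    """Right-hand side of the SEIRD equations; state = (S, E, I, R, D) as fractions."""
    S, E, I, R, D = state
    new_exposed = beta * S * I
    return (-new_exposed,
            new_exposed - sigma * E,
            sigma * E - gamma * I,
            gamma * I * (1.0 - m),
            gamma * I * m)

def midpoint_step(state, dt, *params):
    """Estimate the rates, move half a step, re-estimate, then apply the full step."""
    k1 = derivs(state, *params)
    half = tuple(x + 0.5 * dt * k for x, k in zip(state, k1))
    k2 = derivs(half, *params)
    return tuple(x + dt * k for x, k in zip(state, k2))

def simulate(beta=0.32, sigma=0.2, gamma=1.0 / 28.0, m=0.01,
             population=7e9, days=400, dt=0.1):
    """Run from one exposed person; return final state, peak infected, and its day."""
    e0 = 1.0 / population
    state = (1.0 - e0, e0, 0.0, 0.0, 0.0)
    peak_I, peak_day = 0.0, 0.0
    for n in range(1, int(round(days / dt)) + 1):
        state = midpoint_step(state, dt, beta, sigma, gamma, m)
        if state[2] > peak_I:
            peak_I, peak_day = state[2], n * dt
    return state, peak_I, peak_day

final, peak_I, peak_day = simulate()
print(f"peak infected fraction {peak_I:.2f} near day {peak_day:.0f}; "
      f"never infected {final[0]:.5f}")
```

The exact peak day depends on the time step and on details of the integration, so small differences from the numbers quoted in this article are to be expected.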

Here’s a result:

Peak exposures happen at day 159. Peak infections occur near day 171. These peak because the number of susceptible people drops precipitously: the fuel for the fire is becoming depleted. From this point, the susceptible population decreases more and more slowly until it plateaus at 0.013% of the total. So essentially everyone is infected and of these, 1% die and 99% recover.

This is obviously horrific, but the model makes the grossly simplified assumption that there is zero containment, that everyone intermingles equally, and nobody is quarantined when they become infectious. Of course quarantining requires testing, and the US isn’t testing much yet.

This plot shows some interesting features. On the ramp-up, exposed and infected populations are fairly close. Even though the infectious period is longer than the incubation period, because the number of exposures is increasing exponentially, close to half of those infected do not yet show symptoms. After the peak, the number of exposed drops off relatively rapidly, but the infected population takes longer to drop, since people may remain sick much longer than the incubation period.

reducing the exposure rate

Here I varied the scaled exposure rate, where 1 in this plot corresponds to the 5-day doubling time observed in the aggregate outside of China. I plot the final number of susceptible patients, recovered, and dead (assuming 1% case mortality rate).

If the exposure rate can be reduced below 12% of normal, the disease fails to spread. This was surprising to me, but it corresponds to a case where people recover faster than they can infect others, and thus the number infected tends to decrease.

From a 12% relative exposure rate up to around 30%, a significant fraction of people avoid getting infected. At 20%, 28% avoid infection, while at 30%, 10% avoid infection.

Above 30%, pretty much everyone under this model gets infected. The death rate saturates at 1% of total population. So in this case reducing exposure rate helps only in that it helps spread out the infections over time.
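These three regimes can be roughly reproduced without running the full simulation. The sketch below is an assumption on my part rather than how the plot was made: it uses the standard final-size relation s = exp(–R₀ (1 – s)) for the fraction s never infected, with R₀ scaled in proportion to the exposure rate, and solves it by bisection. The threshold sits where the scaled R₀ equals 1, which is about 1/9 of the normal exposure rate.

```python
import math

R0_BASE = 0.32 * 28.0   # beta / gamma with the parameters from the text

def final_susceptible(r0, tol=1e-12):
    """Fraction never infected: the root of s = exp(-r0 * (1 - s)) below 1."""
    if r0 <= 1.0:
        return 1.0                      # below threshold the disease cannot grow

    def f(s):
        return s - math.exp(-r0 * (1.0 - s))

    lo, hi = 0.0, 1.0 - 1e-9            # f(lo) < 0 and f(hi) > 0 when r0 > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for scale in (0.1, 0.2, 0.3, 1.0):
    s = final_susceptible(R0_BASE * scale)
    print(f"exposure rate x{scale:.1f}: {100 * s:.2f}% never infected")
```

The values from this relation are close to, but not identical with, the percentages I read off the plot, which come from the time-stepped model.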

exposure rate and peak infections

When people become exposed, they become infected an average of 5 days later, then are infectious for 4 weeks on average (both under the assumptions of the model). Then they recover or die. So the more the rate of exposure can be slowed, the fewer people are infected at the epidemic peak, since people infected earlier have time to recover before new people are infected to replace them.

Here’s a plot of the peak exposure and infected fractions, as a function of the normalized exposure rate. With the assumption of this model, you want the peak fraction of infected multiplied by the fraction of infected requiring hospital care to be less than the residual hospital capacity (above the baseline level of care for other conditions).

For example, with no interventions, 48% of the population would be infected at peak. If the rate of exposure were reduced by 80%, to 0.2 on the x-axis, then only 10% of the population would be infected at the peak.

At and below 12% there are no data because there is no peak: the exposure rate is too small and people recover faster than they infect others. The disease cannot grow.

Of course these numbers are from this model and may not accurately apply to COVID-19, since the model is a simple one.

exposure rate and epidemic duration

When you “flatten the curve” by reducing the exposure rate, using a combination of social isolation and testing and quarantining infectious patients, then the epidemic lasts longer. This is illustrated in the following plot from the model, showing for each normalized exposure rate, how long the number of exposed and the number of infected stay at or above half of their peak value for that exposure rate:

So if you reduce the exposure rate, that means that people will be sick for a longer period. This spreads out the load on the health care system, but it also increases the duration of economic impacts. At this point the question becomes: are you more interested in saving lives or accelerating economic activity? I think if the answer is the latter there’s something very, very wrong with our social infrastructure, and you need to revisit your priorities.

tuning for R₀

The key model characteristics are the doubling time, which is the time it takes the number of infections to double during the exponential growth phase, and the basic reproduction number R₀, which describes how many people are infected by each infected individual.

Tuning parameters for this are β, which is the exposure rate, and γ, which is the rate at which people recover (or in this model, die). The recovery rate γ is the inverse of the mean time people spend infectious. In what precedes this, I assume this time is 28 days, which is reported as the time people remain sick. However, if people can be quarantined while still sick, this time can be reduced. In this case, once a patient is quarantined that patient is effectively either recovered or dead. The delay until the actual event is essentially irrelevant, for modeling purposes.

This plot shows the effect of varying γ and β on R₀ and the doubling time. Thick black lines show the effect of varying β for a given γ, while the thin grey lines tie together points of the same β.

The green box shows ranges for these parameters consistent with what I have seen in references. In particular, R₀ is estimated at between 2 and 3. The curve for a γ of 1 / 28 days falls outside this box: if I tune to the desired doubling time, then the R₀ is too large. To fall within the green box, I need to increase γ (reducing the infectious period), and increase β (the exposure rate). To fall within the box the mean infectious time should be approximately 4 days rather than 28 days. Then to get a doubling time of 5 days, β should be approximately doubled, to 0.7.
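The tuning can also be done in closed form instead of reading it off the plot. This is a minimal sketch under the assumption that S is close to 1 during exponential growth, so the growth rate r satisfies (r + σ)(r + γ) = β σ; substituting β = R₀ γ and solving for γ gives γ = r (r + σ) / (σ (R₀ – 1) – r). The function name tune is illustrative only.

```python
import math

def tune(r0, doubling_days, sigma):
    """Return (beta, gamma) giving the requested R0 and early doubling time.

    Uses (r + sigma) * (r + gamma) = beta * sigma with beta = r0 * gamma,
    which holds while nearly everyone is still susceptible.
    """
    r = math.log(2.0) / doubling_days
    denom = sigma * (r0 - 1.0) - r
    if denom <= 0.0:
        raise ValueError("doubling time is too short for this R0 and sigma")
    gamma = r * (r + sigma) / denom
    return r0 * gamma, gamma

beta, gamma = tune(r0=2.8, doubling_days=5.0, sigma=0.2)
print(f"beta = {beta:.2f}/day, mean infectious time = {1.0 / gamma:.1f} days")
```

For R₀ = 2.8 and an exact 5-day doubling time this gives an infectious time of a few days, in the same range as the rounded values of 4 days and β = 0.7 used above.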

This results in the following over time:

Here R₀ = 2.8, which is much less than the 9 in the original parameter set, but people are infectious for only 4 days. Now 8% of the population remains uninfected, the infected (actually infectious) cases drop more rapidly after the peak, and the peak number is less. This is misleading, however, since I am assuming some of the “recovered” population is actually in the hospital getting past the natural infectious period. However, the important thing is the lower R₀ means less social isolation will be needed to reverse exponential growth.

In hindsight, this result is obvious. If you want a greater rate of increase, you need to reduce the time lag between getting infected and infecting others. This can happen either by reducing the incubation time (this model falsely assumes people are not infectious during the full incubation time), or by increasing the infection rate (β) once infectious. But to keep the reproduction ratio down, if the number infected is fixed, then when β increases, the rate of recovery γ needs to be increased. A primary error here was the assumption of a mean incubation time of 5 days without being infectious. Another error was assuming people are infectious throughout the duration of their illness.

A model needs to be properly tuned to be quantitatively valuable. However, with rough tuning it can still provide guidance and insight.

enhanced model (Alison Hill)

Alison Hill has an improved form of the SEIR model: you might call it an SEIIIRD model. A web interface is here. With this model there are three stages of infection: mild, moderate, and severe. Moderately and severely infected patients are hospitalized (if there’s capacity) and removed from the infectious pool. Up to the point of mild infection our models are similar. But I assume patients keep infecting others until recovered or dead, which lasts on average 28 days.

This model addresses the issue that my assumption of being highly infectious for 28 days resulted in an overestimate of R₀. With this model, patients can spend a majority of that 28 days sick in bed not infecting others.

Here are the equations in the Alison Hill model:

∂S/∂t = –(β₁ I₁+β₂ I₂+β₃ I₃) S
∂E/∂t = (β₁ I₁+β₂ I₂+β₃ I₃) S – σ E
∂I₁/∂t = σ E – (γ₁+p₁) I₁
∂I₂/∂t = p₁ I₁ – (γ₂+p₂) I₂
∂I₃/∂t = p₂ I₂ – (γ₃+μ) I₃
∂R/∂t = γ₁ I₁+γ₂ I₂+γ₃ I₃
∂D/∂t = μ I₃

where instead of a single infectious state, there’s three states:
I₁: mild infection
I₂: moderate infection
I₃: serious infection
From each stage of infection, patients either recover or pass to the next stage. At the final stage, they recover or die.
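The equations above translate directly into code. This is a minimal Python sketch of the right-hand side only, not the implementation behind Alison Hill’s web interface. The argument names are my own, and the p₁ and p₂ values in the demonstration are placeholders picked only to exercise the function.

```python
def hill_derivs(state, b1, b2, b3, sigma, g1, g2, g3, p1, p2, mu):
    """Rates of change for (S, E, I1, I2, I3, R, D), all as population fractions."""
    S, E, I1, I2, I3, R, D = state
    new_exposed = (b1 * I1 + b2 * I2 + b3 * I3) * S
    return (-new_exposed,
            new_exposed - sigma * E,
            sigma * E - (g1 + p1) * I1,      # mild: recover or progress
            p1 * I1 - (g2 + p2) * I2,        # moderate: recover or progress
            p2 * I2 - (g3 + mu) * I3,        # serious: recover or die
            g1 * I1 + g2 * I2 + g3 * I3,
            mu * I3)

# Placeholder progression rates (hypothetical, for demonstration only).
P1_PLACEHOLDER = 0.03
P2_PLACEHOLDER = 0.06

state = (0.90, 0.02, 0.03, 0.02, 0.01, 0.015, 0.005)
rates = hill_derivs(state, 0.435, 0.0, 0.0, 0.4, 0.133, 0.188, 0.060,
                    P1_PLACEHOLDER, P2_PLACEHOLDER, 0.01)
print("sum of rates (should be ~0):", sum(rates))
```

Because every term that leaves one compartment enters another, the rates sum to zero and the total population is conserved, which is a handy sanity check when coding the model.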

calibrating the Alison Hill model

The Alison Hill model has some default calibration settings. However, these result in a relatively long doubling time of 9 days. So I tuned parameters as follows. I note where I changed parameters from Alison’s defaults.

β₁ = 0.435 (set for +15% per day, approximate 5-day doubling)
β₂ = 0
β₃ = 0
σ = 0.4 (contagiousness begins 2.5 days before symptoms)
γ₁ = 0.133
γ₂ = 0.188
γ₃ = 0.060
μ = 0.01 (tuned for net 1% case mortality)

The basic reproduction number R₀ came out to 2.62, which is close to the literature values.
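One way to see where a number like this comes from is to add up the infections caused in each stage: time spent in the stage, times the infection rate for that stage, times the chance of reaching it. The sketch below is that bookkeeping as I understand it from the equations, not a formula quoted from Alison Hill’s documentation. The p₁ value is not taken from the parameter list above: it is back-solved from R₀ = 2.62, so treat it as an inference, and p₂ is a placeholder that has no effect while β₂ = β₃ = 0.

```python
def hill_r0(b1, b2, b3, g1, g2, g3, p1, p2, mu):
    """R0 as the sum over stages of (chance of reaching) * (time in stage) * beta."""
    reach2 = p1 / (g1 + p1)                  # fraction progressing mild -> moderate
    reach3 = reach2 * p2 / (g2 + p2)         # fraction progressing on to serious
    return (b1 / (g1 + p1)
            + b2 * reach2 / (g2 + p2)
            + b3 * reach3 / (g3 + mu))

b1, g1 = 0.435, 0.133                        # from the parameter list
p1_implied = b1 / 2.62 - g1                  # inferred, valid only if b2 = b3 = 0
P2_PLACEHOLDER = 0.06                        # hypothetical; irrelevant when b2 = b3 = 0

r0 = hill_r0(b1, 0.0, 0.0, g1, 0.188, 0.060, p1_implied, P2_PLACEHOLDER, 0.01)
print(f"implied p1 = {p1_implied:.3f}/day, R0 = {r0:.2f}")
```

With β₂ = β₃ = 0 only the mild stage contributes, so R₀ reduces to β₁ / (γ₁ + p₁).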

interventions with the Alison Hill model

I used this model to examine the effect of interventions. With interventions, I assume the coefficient β₁ is reduced from 0.435 to 0.10 due to social isolation, extensive testing and quarantine, scanning people for fevers, etc. (although by the time people have fevers, they have already been infectious for days).

Here’s how the interventions affect the infections and death rate once they are implemented. Infections here include pre-symptomatic cases which are now infectious, but also people who are sick but no longer infectious. That said, there’s a lag of several days between the intervention and a reduction of infected people. There’s a more substantial lag before the death rate decreases.

But I assume these interventions are only temporary, so are relaxed after some time. Here’s the result for 30, 60, and 90 day interventions:

The result is the interventions cause the infection to soon switch from exponential growth to attenuation. But obviously, once the intervention is relaxed, the infection grows again. The exception would be if the virus could be completely eliminated.
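The switch between growth and attenuation follows from simple arithmetic. With β₂ = β₃ = 0, the reproduction number is proportional to β₁, so a minimal sketch (assuming nearly everyone is still susceptible, and using only numbers from this article) is:

```python
def scaled_r0(r0_baseline, beta_baseline, beta_new):
    """Reproduction number after changing beta1, valid when beta2 = beta3 = 0."""
    return r0_baseline * beta_new / beta_baseline

r0_intervention = scaled_r0(2.62, 0.435, 0.10)   # during the intervention
r0_relaxed = scaled_r0(2.62, 0.435, 0.435)       # after it is lifted
print(f"during intervention: {r0_intervention:.2f}, after: {r0_relaxed:.2f}")
```

During the intervention the value is below 1, so infections decay. Once β₁ returns to 0.435 it is back above 1, and as long as most people are still susceptible the infection grows again.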

The lesson from this exercise is interventions must continue until either a vaccine is available, or the symptoms of the disease can be treated without overtaxing the medical system.

disclaimer

Of course this model is extremely simple, and in reality people’s networking is far more complex. In reality, β is different for different people, and for a given person it depends on whether the infected people are nearby or remote. More complex networks can be simulated, with randomized statistical distributions of parameters, for example gamma or exponential distributions, but I’m not an epidemiologist so this is beyond my expertise.