
Covid-19 Superspreading Events Database

Koen Swinkels
Jun 12 · 9 min read

1,400+ Superspreading Events From Around the World

Latest article update: August 21. Latest database update: August 29.

Covid-19 Superspreading Events (SSEs) are events where multiple people (a minimum of 5, for example) are infected with the novel coronavirus. Those infected often go on to infect others in turn.

SSEs appear to be the key driving force behind the current pandemic, so to have a better chance of containing the virus it is crucial to find out whether the many SSEs that took place around the world share certain characteristics. Knowing in what types of settings SSEs typically occur may help prevent them from happening in the future.

This database with 1,400+ SSEs from around the world may assist in this endeavour. For each SSE the following features are identified:

  • what the location of the event was
  • when the event occurred
  • how many people were infected directly and/or indirectly
  • what kind of setting the event took place in
  • what kind of activity took place
  • whether the event took place indoors or outdoors
  • whether the event occurred during flu season in that location
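
As a rough illustration of the features listed above, one row of the database could be modelled as a small record type. This is only a sketch: the field names and the example values are assumptions made for illustration and are not the actual column headers or data of the sheet.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SuperspreadingEvent:
    """One SSE record. Field names are illustrative, not the sheet's own columns."""
    location: str                # where the event took place
    index_date: Optional[date]   # when it occurred (None if unknown)
    total_cases: int             # people infected directly and/or indirectly
    setting: str                 # kind of setting, e.g. "nursing home"
    activity: str                # kind of activity that took place
    indoor_outdoor: str          # "indoor", "outdoor" or "indoor/outdoor"
    flu_season: Optional[bool]   # flu season at that location at the time?


# Made-up example values, not an entry from the database.
example = SuperspreadingEvent(
    location="Example City, Country",
    index_date=date(2020, 3, 1),
    total_cases=12,
    setting="rehearsal venue",
    activity="singing",
    indoor_outdoor="indoor",
    flu_season=True,
)
print(example)
```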

Database

Click here to go to the database. Please read the Notes sheet for more information about the database and its limitations.

The database as well as this article will continue to be updated with new SSEs and new information.

The project now also has a website.


Bubble Map with Animated Timeline

Click here to go to the SSE bubble map. Zoom in and click on a bubble to see more information about an SSE.


And click here to see the same map but with an animated timeline that shows the superspreading events as they occurred in time. You can play and pause the animation, and click on a bubble to find more information about an event.


Please note that for most of the large SSEs in the US as well as in some other countries the index dates in the database are placeholders (in the database these cells are marked light red in the Index Date column).

The focus of the database is on the features of the settings in which SSEs took place, and on the general period in which they occurred, specifically whether they occurred in flu season or not. Finding the exact dates for more than 900 SSEs would require a disproportionate amount of time. What this does mean is that no conclusions about the spread of the virus in the US and those countries should be drawn from the timeline.
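
For anyone who does want to look at timing, a cautious first step would be to set aside the rows whose index date is a placeholder. The sketch below assumes a hypothetical `date_is_placeholder` flag standing in for the light red cell marking, and uses made-up rows rather than real entries.

```python
from datetime import date

# Made-up rows for illustration. "date_is_placeholder" is a hypothetical flag
# standing in for the light red marking in the Index Date column.
rows = [
    {"event": "A", "index_date": date(2020, 2, 15), "date_is_placeholder": False},
    {"event": "B", "index_date": date(2020, 4, 1), "date_is_placeholder": True},
    {"event": "C", "index_date": date(2020, 3, 10), "date_is_placeholder": False},
]


def rows_with_real_dates(rows):
    """Keep only rows whose index date is not a placeholder."""
    return [r for r in rows if not r["date_is_placeholder"]]


# Order the remaining events by date before building any timeline from them.
timeline = sorted(rows_with_real_dates(rows), key=lambda r: r["index_date"])
print([r["event"] for r in timeline])
```

Even after such filtering, the remaining dates are often estimates, so a timeline built this way should still be read with care.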

Also note that many SSEs in the database occurred over an extended period of time rather than on one specific day. Mardi Gras festivities, for example, typically are a weeks-long event, and outbreaks in prisons, nursing homes and meat processing plants also take place over an extended period of time.

Preliminary Results

  • Nearly all SSEs in the database took place indoors: the exceptions are SSEs in settings with both indoor and outdoor elements, where it is not clear whether transmission occurred indoors or outdoors
  • The vast majority took place in settings where people were essentially confined together for a prolonged period (for example, nursing homes, prisons, cruise ships, worker housing)
  • The great majority of SSEs happened during flu season in that location
  • Food processing plants where temperatures are kept very low (meat, dairy, frozen foods) seem particularly vulnerable to SSEs compared to other types of factories and plants, where very few SSEs occurred

Limitations

When assessing these conclusions it is important to keep in mind that the database has some severe limitations. Regarding flu season, for example, note that people tend to spend much more time indoors during flu season than in other seasons (with the possible exception of regions with very hot summers where people seek refuge in air-conditioned indoor spaces), so when it comes to SSEs, flu season and time spent indoors need not be independent of each other. Vitamin D deficiency, which has been linked to an increased Covid-19 risk, is also likely to result both from spending more time indoors and from the less intense sunlight outside during flu season.
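
One way to see the problem of flu season and indoor settings being entangled is to cross-tabulate the two features. The sketch below uses made-up rows, not database entries, and hypothetical field names: when almost every event falls in the same cell, the data cannot tell the two factors apart.

```python
from collections import Counter

# Made-up rows for illustration only; they are not taken from the database.
events = [
    {"flu_season": True, "indoor": True},
    {"flu_season": True, "indoor": True},
    {"flu_season": True, "indoor": True},
    {"flu_season": False, "indoor": True},
]


def cross_tab(events):
    """Count events for each (flu_season, indoor) combination."""
    return Counter((e["flu_season"], e["indoor"]) for e in events)


table = cross_tab(events)
for flu in (True, False):
    for indoor in (True, False):
        print(f"flu_season={flu!s:5} indoor={indoor!s:5} count={table[(flu, indoor)]}")
```

In this toy sample no event is outdoors at all, so nothing can be said about whether season matters independently of being indoors.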

In addition, the virus seems to have first emerged at the start of the flu season in much of the Northern Hemisphere and it’s only about seven months old, so for a large part of the world we simply do not yet have enough data to determine with what frequency SSEs may occur outdoors and outside of flu season as well.

Furthermore, in the past few months societies have taken all sorts of measures to try to contain the virus, so it is possible — though by no means certain — that it was some of those measures rather than warmer weather and/or people spending more time outdoors that caused the apparent slowing of the spread of the virus that we are witnessing in large parts of the Northern Hemisphere. Moreover, the settings that were closed by societies in response to the pandemic could then of course no longer give rise to any superspreading events. As a result, certain types of high-risk settings could be significantly underrepresented in the database. To take an obvious example, from the fact that there are no indoor concerts in the database it does not follow that indoor concerts do not pose a serious risk.

Also, it seems likely that there are other features, not investigated here or not investigated sufficiently, that could increase the likelihood that an SSE will occur. One possible example concerns the role of aerosols in transmission. It seems worthwhile to take a closer look at the degree and types of ventilation, air conditioning and humidity control in known SSE settings, as well as at the nature of the activities (singing, laughing, shouting and exercising, for example) that took place in such settings. In nursing homes, cruise ships and prisons, for example, AC systems could be a major source of transmission by circulating air with droplets that contain infectious virus particles from room to room. If these systems merely heat or cool indoor air without taking in outside air, and without killing virus particles or removing them from the air, the virus can travel freely throughout a facility, no matter how closely its residents adhere to social distancing rules. A more detailed investigation of these types of SSE settings could help determine whether this is a plausible scenario.

Selection Biases

When documenting SSEs there may also be one or more selection biases at play. For example, some types of SSEs may have been:

  • more memorable to those who were infected there than other types of events were to people infected in those other settings
  • more easily traceable by contact-tracing teams
  • easier to find in the literature by researchers (for example, nursing homes and prisons tend to do a lot of testing and publish the results)

These biases could make certain types of SSEs more likely to be discovered and documented than others, which in turn could steer efforts to prevent future SSEs in the wrong direction.

There may also be a confirmation bias problem once a specific SSE has been identified. The mere fact that people who now have the virus attended an event that was subsequently identified as an SSE does not mean that those people were infected at that event. Infection may have occurred elsewhere, before or after. It is by no means always realistically possible to prove with a high degree of confidence when and where infection occurred. The higher infection rates in a community already are, the more of a problem this becomes. And there is a risk that once an event has been designated as an SSE it develops its own gravitational pull: with the SSE in mind, researchers may more easily assume that newly infected people with a direct or indirect link to the event acquired the infection there or through somebody who was there, rather than in another way. In the aggregate this could also lead to an overestimation of the role SSEs play in the pandemic in general.

Lastly, the database includes settings such as nursing homes and prisons and typically takes the cumulative number of cases. In reality these infections could have occurred over the course of several days or weeks. Moreover, some people counted in this number may have been infected outside of this setting. If the number of infections in a community is significant then the number of staff who were infected outside of this setting may also be significant. The reason these settings are nonetheless included in a database of superspreading events is that due to the nature of these facilities — residents cannot leave and residents typically have been there for a while — the large majority of residents will have been infected in the facility itself.

Incomplete and Imperfect

Note that while the goal for the database is to eventually include all SSEs found by researchers and authorities, currently that project is far from completion. Many more SSEs than just these 1,400+ have taken place around the world.

The information in the database is also by no means fully accurate or complete. Dates often had to be guesstimated based on the information available in publications about the events. And, as noted before, placeholders are sometimes used for dates, as well as for the number of cases associated with an SSE. Cells that contain placeholder data or data that needs to be checked for accuracy are marked in light red.

In addition, GPS data for some of the American SSEs may be inaccurate because the coordinates were produced with a bulk conversion method that is not fully reliable. Here too, doing this manually for more than 900 US SSEs would have taken a disproportionate amount of time.

Also note that for some SSEs only the number of infections at the initial event is included while for others more detailed information about secondary or even tertiary infections was available (via this smaller database, for example) and included in the total number of infections associated with an event. This obviously makes direct quantitative comparisons between SSEs problematic.

Lastly, whenever there are differing estimates of the number of cases associated with an SSE, the database always uses the lowest number.
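
The lowest-estimate rule can be expressed in a few lines. This is a minimal sketch with a hypothetical function name and made-up numbers, not a description of how the sheet is actually maintained.

```python
def conservative_case_count(estimates):
    """Return the lowest of the reported case counts for one event."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    return min(estimates)


# Hypothetical example: three sources report different totals for one event.
print(conservative_case_count([52, 61, 45]))  # prints 45
```

Taking the minimum keeps the totals conservative, at the cost of possibly understating the size of some events.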

So there are a lot of limitations to the database and the information it contains, and it is very much imperfect and a work in progress. Please do not assume it is a representative sample of SSEs and please do not draw hasty conclusions from the data.

Feedback and Help

Any help (corrections, additions, suggestions) would be much appreciated. Send your information or create a copy of this database, paste your additional information or your corrections, give those cells a blue fill-in color, and send a link to your sheet to info@superspreadingdatabase.com. Assistance with improving the visualization (for example, to make it look more like this) is also welcome.

The database could also benefit from institutional affiliation, for both distribution and respectability purposes. If you are part of a research group that can help in this regard, please send an email to info@superspreadingdatabase.com.

If you'd rather go it alone, the database and all data in it may be freely used by anyone, in whatever way. A link to this article is always appreciated.
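
For those who want to work with the data programmatically, Google Sheets documents that are shared publicly can usually be downloaded as CSV through an export URL. The sketch below only builds that URL from the sharing link given in the citation; it assumes the sheet is still publicly shared, and the helper name is hypothetical. Without a sheet selector, the export is expected to return the first tab only.

```python
import re

# Sharing link as given in the citation of this article.
SHEET_URL = (
    "https://docs.google.com/spreadsheets/d/"
    "1c9jwMyT1lw2P0d6SDTno6nHLGMtpheO9xJyGHgdBoco/edit?usp=sharing"
)


def csv_export_url(sheet_url):
    """Build a CSV export URL from a Google Sheets sharing link."""
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", sheet_url)
    if match is None:
        raise ValueError("not a Google Sheets URL")
    sheet_id = match.group(1)
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv"


print(csv_export_url(SHEET_URL))
```

The resulting URL can then be passed to a CSV reader such as `pandas.read_csv`. Please read the Notes sheet before interpreting any column.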

How to cite the database

Swinkels, K. (2020). COVID-19 Superspreading Events Around the World [Google Sheet]. Retrieved from https://docs.google.com/spreadsheets/d/1c9jwMyT1lw2P0d6SDTno6nHLGMtpheO9xJyGHgdBoco/edit?usp=sharing

References

Most of the SSEs in this database come from the following sources:


Support

Patreon, PayPal & Bitcoin: 1JHyK1eM7jvKPVrU7UFX8hbb2Xea9egrAP

Koen Swinkels

Written by

https://www.twitter.com/KoenSwinkels ::: ForeignPolicyFollies.blogspot.com ::: PhilosophyOfBitcoin.com
