Querying VAERS

Derek Li
Sep 10, 2021


CORRECTION: The original post stated that the data set contains VAERS reports processed as of October 23rd, 2021. The correct date is August 23rd, 2021, and the post has since been corrected. (9/10/2021)

Part 2 of “Querying VAERS” can be found here.

What is VAERS?

Established in 1990, the Vaccine Adverse Event Reporting System (VAERS) is a national early warning system to detect possible safety problems in U.S.-licensed vaccines. VAERS is co-managed by the Centers for Disease Control and Prevention (CDC) and the U.S. Food and Drug Administration (FDA). VAERS accepts and analyzes reports of adverse events (possible side effects) after a person has received a vaccination. Anyone can report an adverse event to VAERS. Healthcare professionals are required to report certain adverse events and vaccine manufacturers are required to report all adverse events that come to their attention.

The HHS also publishes a Guide to Interpreting VAERS Data:

When evaluating data from VAERS, it is important to note that for any reported event, no cause-and-effect relationship has been established. Reports of all possible associations between vaccines and adverse events (possible side effects) are filed in VAERS. Therefore, VAERS collects data on any adverse event following vaccination, be it coincidental or truly caused by a vaccine. The report of an adverse event to VAERS is not documentation that a vaccine caused the event.

It also reminds people to keep in mind the following limitations:

“Underreporting” is one of the main limitations of passive surveillance systems, including VAERS. The term, underreporting refers to the fact that VAERS receives reports for only a small fraction of actual adverse events.

VAERS data can be queried interactively using CDC’s VAERS WONDER system. However, I found it not only cumbersome to use, but also extremely limited in its capability.

Fortunately, the raw VAERS data set is also made available in CSV format. At the time of writing, the latest data was published on September 3rd, 2021, and contains VAERS reports processed as of August 23rd, 2021.

I downloaded the All Years Data and uploaded the files to an Azure storage account. There are three tables per year (yyyyVAERSDATA.csv, yyyyVAERSVAX.csv, and yyyyVAERSSYMPTOMS.csv), and the flat file schemas can be found in the VAERS User Guide.

These files are uploaded to their corresponding container so they can be later ingested.
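As a rough sketch of that sorting step, the per-year files can be grouped by table name before upload. The container names (`vaersdata`, `vaersvax`, `vaerssymptoms`) and the helper `group_by_container` are made up for illustration and are not necessarily what I used. The actual upload call is left out.

```python
import re
from collections import defaultdict

# Per-year files are named like 2021VAERSDATA.csv. The lowercase table name
# is used here as a hypothetical container name.
FILE_PATTERN = re.compile(r"^(\d{4})VAERS(DATA|VAX|SYMPTOMS)\.csv$", re.IGNORECASE)


def group_by_container(filenames):
    """Map each (hypothetical) container name to its sorted per-year files."""
    containers = defaultdict(list)
    for name in filenames:
        match = FILE_PATTERN.match(name)
        if match is None:
            continue  # skip anything that is not a per-year VAERS file
        containers["vaers" + match.group(2).lower()].append(name)
    return {table: sorted(files) for table, files in containers.items()}


files = [
    "2021VAERSDATA.csv",
    "2020VAERSDATA.csv",
    "2021VAERSVAX.csv",
    "2021VAERSSYMPTOMS.csv",
    "README.txt",
]
grouped = group_by_container(files)
for container, members in grouped.items():
    print(container, members)
```

Keeping one container per table means each container maps onto a single schema, which keeps ingestion into one destination table per container straightforward.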

To query the data I’m using Azure Data Explorer (a.k.a. Kusto), which makes manual ingestion of data from blob containers very easy.

Knowing the limitations of VAERS data, what does it tell us?

First we can take a look at reported cases by age groups.

Next, case count by sex: 345k of the reported cases are female, 135k are male, and 34k are other.
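For readers without a Kusto cluster, the same two breakdowns can be approximated in plain Python. This is only a sketch: the rows below are made up, the ten-year bucket width is my assumption, and the column names `AGE_YRS` and `SEX` are taken from the VAERSDATA schema as I understand it from the VAERS User Guide.

```python
import csv
import io
from collections import Counter

# Made-up rows in the shape of yyyyVAERSDATA.csv (only the columns used here).
SAMPLE = """VAERS_ID,AGE_YRS,SEX
1,34.0,F
2,71.0,M
3,,U
4,12.0,F
5,65.0,F
"""


def age_group(age_text, width=10):
    """Return a label such as '30-39', or 'unknown' when the age is blank."""
    if not age_text.strip():
        return "unknown"
    lower = int(float(age_text) // width) * width
    return f"{lower}-{lower + width - 1}"


def count_by(rows, key_func):
    """Count rows per key, like a 'summarize count() by key' query."""
    return Counter(key_func(row) for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_age = count_by(rows, lambda r: age_group(r["AGE_YRS"]))
by_sex = count_by(rows, lambda r: r["SEX"] or "unknown")
print(by_age)
print(by_sex)
```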

Same data, but over time:

We can also look at case count by vaccinated date and VAERS case received date. Cases are usually received not long after the vaccinated date, but there’s a jump in the number of received cases in August that looks like an anomaly worth further investigation.
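A minimal sketch of the two time series, bucketed by month, is below. It assumes the dates are stored as MM/DD/YYYY text in the `VAX_DATE` and `RECVDATE` columns and that the vaccination date can be blank. The sample rows are made up.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up rows; dates are assumed to be MM/DD/YYYY text.
SAMPLE = """VAERS_ID,VAX_DATE,RECVDATE
1,01/05/2021,01/07/2021
2,01/20/2021,02/01/2021
3,,02/03/2021
4,07/30/2021,08/02/2021
"""


def month_of(date_text):
    """Return 'YYYY-MM' for a MM/DD/YYYY string, or None when it is blank."""
    if not date_text:
        return None
    return datetime.strptime(date_text, "%m/%d/%Y").strftime("%Y-%m")


def monthly_counts(rows, column):
    """Count rows per calendar month of the given date column."""
    counts = Counter()
    for row in rows:
        month = month_of(row[column])
        if month is not None:
            counts[month] += 1
    return dict(sorted(counts.items()))


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_vax_month = monthly_counts(rows, "VAX_DATE")
by_recv_month = monthly_counts(rows, "RECVDATE")
print(by_vax_month)
print(by_recv_month)
```

Plotting the two dictionaries side by side is what makes a spike in received cases without a matching spike in vaccinations stand out.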

We can also look at the average time between vaccination and case received date, along with the 50th, 90th, and 99th percentiles:
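The calculation itself is simple enough to sketch by hand. The example below uses made-up date pairs and the nearest-rank percentile definition, which is one common convention and may differ slightly from how a query engine estimates percentiles on large data.

```python
import math
from datetime import datetime


def lag_days(vax_date, recv_date, fmt="%m/%d/%Y"):
    """Days between vaccination and report receipt, or None if a date is blank."""
    if not vax_date or not recv_date:
        return None
    return (datetime.strptime(recv_date, fmt) - datetime.strptime(vax_date, fmt)).days


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


# Made-up (VAX_DATE, RECVDATE) pairs; the pair with a blank date is skipped.
PAIRS = [
    ("01/05/2021", "01/07/2021"),
    ("01/20/2021", "02/01/2021"),
    ("03/01/2021", "03/04/2021"),
    ("07/30/2021", "07/31/2021"),
    ("06/01/2021", "07/11/2021"),
    ("", "02/03/2021"),
]

lags = sorted(d for d in (lag_days(v, r) for v, r in PAIRS) if d is not None)
mean_lag = sum(lags) / len(lags)
summary = {p: percentile(lags, p) for p in (50, 90, 99)}
print(mean_lag, summary)
```

Because a handful of very late reports can pull the mean far above the median, the percentiles are usually the more informative numbers here.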

Looking at case count by where the vaccine was administered, it looks like almost all the cases in the August anomaly are from “Unknown”.

There’s also data on whether the adverse event required a doctor’s office or emergency room visit:

Looking at whether the vaccine recipient recovered from the adverse event, roughly equal numbers of recipients recovered and did not recover. The August data again shows an anomaly, with more unknown and recovered cases relative to not-recovered cases than in previous months.

There are also several other columns we can look at, such as whether the adverse event was life threatening, required hospitalization, prolonged existing hospitalization, caused disability, or death.
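These outcome columns can all be tallied the same way. The sketch below assumes each flag column (`DIED`, `L_THREAT`, `HOSPITAL`, `X_STAY`, `DISABLE`) holds "Y" when the outcome applies and is blank otherwise, and that `RECOVD` holds Y, N, or U. That is my reading of the VAERS User Guide, so check it against the current schema. The rows are made up.

```python
import csv
import io
from collections import Counter

FLAG_COLUMNS = ["DIED", "L_THREAT", "HOSPITAL", "X_STAY", "DISABLE"]

# Made-up rows; flag columns are assumed to be "Y" or blank.
SAMPLE = """VAERS_ID,DIED,L_THREAT,HOSPITAL,X_STAY,DISABLE,RECOVD
1,,,Y,,,Y
2,,,,,,N
3,,Y,Y,,,U
4,,,,,,
"""


def flag_counts(rows, columns=FLAG_COLUMNS):
    """Number of reports with each outcome flag set to 'Y'."""
    return {col: sum(1 for row in rows if row[col] == "Y") for col in columns}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
outcomes = flag_counts(rows)
recovered = Counter(row["RECOVD"] or "blank" for row in rows)
print(outcomes)
print(recovered)
```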

The VAERSSYMPTOMS table allows us to look at symptoms of these adverse events:
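Each row of the symptoms table carries up to five symptom terms in separate columns, so the columns need to be unpivoted before counting. Here is a small sketch with made-up rows. It assumes the columns are named `SYMPTOM1` through `SYMPTOM5` and counts mentions rather than distinct reports.

```python
import csv
import io
from collections import Counter

SYMPTOM_COLUMNS = [f"SYMPTOM{i}" for i in range(1, 6)]

# Made-up rows in the shape of yyyyVAERSSYMPTOMS.csv.
SAMPLE = (
    "VAERS_ID,SYMPTOM1,SYMPTOMVERSION1,SYMPTOM2,SYMPTOMVERSION2,"
    "SYMPTOM3,SYMPTOMVERSION3,SYMPTOM4,SYMPTOMVERSION4,SYMPTOM5,SYMPTOMVERSION5\n"
    "1,Headache,24.0,Pyrexia,24.0,Fatigue,24.0,,,,\n"
    "2,Headache,24.0,,,,,,,,\n"
    "3,Pyrexia,24.0,Headache,24.0,,,,,,\n"
)


def symptom_counts(rows):
    """Unpivot the five symptom columns and count each non-blank term."""
    counts = Counter()
    for row in rows:
        for column in SYMPTOM_COLUMNS:
            term = (row.get(column) or "").strip()
            if term:
                counts[term] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
top_symptoms = symptom_counts(rows).most_common(2)
print(top_symptoms)
```

Joining these counts back to the main table on `VAERS_ID` would allow the same breakdown per age group or per month.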

Next steps? There’s a lot more data in the set worth slicing and dicing, particularly around the August anomaly. It would also be great to join it against other data sets, such as general vaccine administration data, to gain further insights.

That is what the data says; the conclusion to draw is up to you.
