Are spam donations distorting how we see presidential candidates?

Raising money for office has never been a more grassroots process. Since Bernie Sanders’ famous run in 2016, the rules for campaign fundraising (at least on the Democratic side) have been turned on their head. With a campaign that attracted almost 5 million donations, famously averaging $27 each, Sanders set a powerful benchmark that other candidates have made great efforts to meet in 2020. Average donations have now become a common heuristic to judge the grassroots viability of a candidate; it is one of the first metrics everyone wants to know when fundraising numbers are revealed, and it is used in fundraising emails and Facebook ads to supporters.

An example of the average donation as a marketing tool in the wild

This single metric is incredibly salient in the discussions around the primary. But it also presents a lot of problems. For example, we have to take a candidate’s word on what the average is. While campaigns are legally obligated to keep track of every donation, they are only obligated to publicly disclose donations if a donor gives more than $200 (as far as I know, no candidate reported many donors under the $200 threshold except for Andrew Yang). If a campaign is trying to attract the kind of grassroots support that would lead to a very low average donation, it follows that most of its donors will not give more than $200.

This gap in transparency can cause controversy. When Beto O’Rourke announced his candidacy, he had an incredible first day of fundraising that brought in over 6.1 million dollars. The question naturally arose around whether these were high dollar donations, or low dollar evidence of grassroots support; everyone looked to the average donation size. At first, O’Rourke only announced the number of donations he received. In certain corners of the internet, this was taken with extreme cynicism. I saw several accusations claiming O’Rourke hid the number of donors to obscure the fact that the campaign had gotten a handful of donors to make hundreds, or even thousands of $1 donations. These accusations were made completely without evidence, and turned out to be untrue (it turns out Beto manipulated his numbers in an entirely different way). But the underlying charge interested me. If a candidate’s average was being distorted, on purpose or not, how could we find out?

The proportion of itemized to non-itemized donations each candidate received, as a whole. (Source: FEC)

The FEC reports for the first quarter have come in, and this gives us our first chance to scrutinize the financial disposition of each candidate. As I mentioned before, every person who donates over $200 in a single cycle, and each of the donations they made to reach that total, are disclosed to the public. The limitations this puts on us are obvious; if a person made 199 $1 donations to a candidate, that would definitely affect the average donation, but we would have no way to know if it happened. And since many candidates are getting well over half of their funding from non-itemized donors, the conclusions we can draw here are very limited. However, since itemized donors tend to be wealthier (because they can afford to spend more than $200 on a candidate), it seems intuitive that they are less likely to split their donations up into very small increments. If we see a significant number of itemized donors making an unexpectedly large number of small donations, that can be a sign that we need to investigate further.

I took the fundraising disclosures for each of the major candidates (and a few others, because why not?) and looked for sketchy behavior. I’ll show you examples from Beto O’Rourke’s disclosure here, but you can see the data for yourself here. First I graphed the number of donations made at each dollar quantity under $7.50. The dataset only includes high dollar donors, so we can discount the possibility these donations are from folks who are chipping in a couple dollars to get the candidate into the debates, or just throwing in $5 and moving on. Every donor here is committed to the campaign. However, there is some noise in this metric since a lot of fundraising emails have a default option to donate $3 or $5, and we should expect some donors to give at this level a few extra times. It looks like we can see a few examples of this in Beto’s data, but we can also firmly reject the earlier conspiracy I mentioned. There simply aren’t enough $1 donations to make a difference.
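The counting step behind that first graph can be sketched in a few lines of Python. This is only an illustration: the `(donor_id, amount)` layout, the function name `count_small_donations`, and the sample records are assumptions made for the sketch, not the FEC's file format or the exact code used for this post.

```python
from collections import Counter

SMALL_DONATION_CUTOFF = 7.50  # the "under $7.50" threshold used in this post


def count_small_donations(donations, cutoff=SMALL_DONATION_CUTOFF):
    """Count how many donations were made at each dollar amount below the cutoff.

    `donations` is an iterable of (donor_id, amount) pairs. That layout is an
    assumption for this sketch. Non-positive amounts (e.g. refunds) are skipped.
    """
    counts = Counter(amount for _, amount in donations if 0 < amount < cutoff)
    # Sort by dollar amount so the result reads like the x-axis of a bar chart.
    return dict(sorted(counts.items()))


# Made-up records for three itemized donors.
sample = [
    ("a", 1.0), ("a", 1.0),
    ("b", 5.0), ("b", 250.0),
    ("c", 3.0), ("c", 5.0),
]
print(count_small_donations(sample))
```

The resulting dictionary maps each small dollar amount to the number of times it was given, which is what the bar chart plots.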

Beto maxes out at 180 $5 donations, not quite enough to fundamentally shift his average

Next, I looked at how many contributions each donor made. If single donors are making hundreds of small donations, that will heavily influence the average. However, Beto again looks clean here.

While this is all publicly available information, I didn’t want to publish identifying information. I assigned a random number to each unique donor here so we can still see the data.
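For anyone who wants to reproduce the per-donor count while keeping names out of the chart, here is a rough sketch. The function name `donations_per_donor`, the `(donor_id, amount)` layout, and the sample names are all hypothetical; shuffled integer labels are just one simple way to do the anonymizing.

```python
import random
from collections import Counter


def donations_per_donor(donations, seed=None):
    """Return {anonymous_label: number_of_donations}.

    `donations` is an iterable of (donor_id, amount) pairs (an assumed layout).
    Each unique donor gets a shuffled integer label, so the output carries no
    identifying information.
    """
    counts = Counter(donor for donor, _ in donations)
    labels = list(range(1, len(counts) + 1))
    random.Random(seed).shuffle(labels)  # break any link between label and input order
    return {label: n for label, n in zip(labels, counts.values())}


# Made-up records: one donor gave three times, one twice, one once.
sample = [
    ("Jane Doe", 1.0), ("Jane Doe", 1.0), ("Jane Doe", 250.0),
    ("John Roe", 5.0), ("John Roe", 500.0),
    ("Pat Poe", 300.0),
]
per_donor = donations_per_donor(sample)
print(sorted(per_donor.values(), reverse=True))
```

A donor with dozens of contributions would stand out at the top of that sorted list.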

Then, I took a look at when these small donations came in, to see if there were any patterns (like increased activity at the kickoff of a campaign, or near the end of the FEC fundraising period).
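A minimal sketch of that timing check, assuming each record is a `(donor_id, amount, day)` tuple with a `datetime.date` in the last slot (the layout, the function name, and the sample dates are assumptions for illustration only):

```python
from collections import Counter
from datetime import date


def small_donations_by_day(donations, cutoff=7.50):
    """Count donations under the cutoff for each calendar day.

    `donations` is an iterable of (donor_id, amount, day) tuples, where `day`
    is a datetime.date. This layout is assumed for the sketch.
    """
    per_day = Counter(day for _, amount, day in donations if 0 < amount < cutoff)
    return dict(sorted(per_day.items()))  # chronological order


# Made-up records: a cluster of small donations on the last day of the quarter.
sample = [
    ("a", 3.0, date(2019, 3, 14)),
    ("b", 250.0, date(2019, 3, 14)),
    ("a", 1.0, date(2019, 3, 31)),
    ("a", 1.0, date(2019, 3, 31)),
    ("c", 5.0, date(2019, 3, 31)),
]
for day, n in small_donations_by_day(sample).items():
    print(day.isoformat(), n)
```

Plotting those daily counts makes launch-day bumps and end-of-quarter spikes easy to spot.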

It looks like there’s a decent bump on Beto’s campaign launch, but it would take way more than that to truly distort his average donation.

Beto’s true and adjusted averages turn out to be practically identical

These three graphs give us the ability to use our intuition and make educated guesses about what’s going on. However, I wanted a metric that is less subjective than interpreting some graphs visually. So I decided to look at how the average donation is affected by these smaller donations. I took the average of all the donations in my dataset, but only counted each donation an individual made under $7.50 a single time. I chose this cutoff because by the time you get to $10, the ability to move the average donation is limited by the higher donation amount, and the lower number of donations one could make before reaching the legal limit. For an example of how this works, if a person makes 20 donations to a candidate, consisting of 15 donations of $1 each, and five donations of $100 each, I will count the 15 $1 donations as only 1 donation of $15. Whereas the true average donation for this example would be $25.75 (515 / 20), the adjusted average would be about $85.83 (515 / 6). This allows us to effectively negate the impact of spam donations.
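Here is the worked example above as a small Python sketch. It reflects one reading of the method: all of a donor's donations under the cutoff are collapsed into a single donation, while larger donations are counted one by one. The function names and the `(donor_id, amount)` layout are hypothetical, and the functions assume a non-empty list of donations.

```python
def true_average(donations):
    """Plain average: total dollars divided by the number of donations."""
    amounts = [amount for _, amount in donations]
    return sum(amounts) / len(amounts)


def adjusted_average(donations, cutoff=7.50):
    """Average after collapsing each donor's sub-cutoff donations into one.

    `donations` is an iterable of (donor_id, amount) pairs (assumed layout).
    The dollar total is unchanged; only the donation count shrinks.
    """
    total = 0.0
    large_count = 0
    donors_with_small = set()
    for donor, amount in donations:
        total += amount
        if amount < cutoff:
            donors_with_small.add(donor)  # counted once per donor
        else:
            large_count += 1              # counted individually
    return total / (large_count + len(donors_with_small))


# The example from the text: 15 donations of $1 and five of $100 from one person.
example = [("donor", 1.0)] * 15 + [("donor", 100.0)] * 5
print(true_average(example))                # 25.75
print(round(adjusted_average(example), 2))  # 85.83
```

Because the dollar total stays at $515 and only the denominator drops from 20 to 6, the adjusted figure can never be lower than the true average.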

This adjusted average allows us to compare the degree of distortion from low dollar donations on the average between campaigns. Since the candidates raised different sums and have widely varying averages, I took the ratio of the adjusted average to the true average, to get a measure of how much distortion is taking place. In the case of the example above, the skew coefficient is 3.3. It’s important to note that this measure is relative. There’s no inherent virtue in having a high or low skew; a campaign that raised money exclusively from people donating the maximum amount would have a skew of 1.0, just the same as a campaign that raised money exclusively from under $200 donors. But if all else is equal, a higher skew would probably imply the average donation is not as trustworthy. Above are the numbers for Beto.
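To make the ratio concrete, here is a self-contained sketch of the skew calculation. The function name `skew`, the `(donor_id, amount)` layout, and the $2,800 figure in the second sample are assumptions for illustration; the collapsing rule is the same reading of the method used in the example above (one counted donation per donor for everything under the cutoff).

```python
def skew(donations, cutoff=7.50):
    """Ratio of the adjusted average donation to the true average donation.

    `donations` is a non-empty list of (donor_id, amount) pairs (assumed layout).
    """
    total = sum(amount for _, amount in donations)
    true_count = len(donations)
    large_count = sum(1 for _, amount in donations if amount >= cutoff)
    small_donors = {donor for donor, amount in donations if amount < cutoff}
    adjusted_count = large_count + len(small_donors)

    true_avg = total / true_count
    adjusted_avg = total / adjusted_count
    return adjusted_avg / true_avg


# The earlier example: 15 x $1 and 5 x $100 from a single donor.
spammy = [("donor", 1.0)] * 15 + [("donor", 100.0)] * 5
# A campaign funded only by large donations has nothing to collapse.
large_only = [("a", 2800.0), ("b", 2800.0)]

print(round(skew(spammy), 2))  # 3.33
print(skew(large_only))        # 1.0
```

Since the dollar total appears in both averages, the skew reduces to the true donation count divided by the adjusted count (20 / 6 in the example), which is why it can be compared across campaigns that raised very different sums.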

The level of skew for each candidate. One of these candidates sticks out a bit more than the others…

With this methodology in place, we can calculate the level of distortion for every candidate. And just like that, we see something potentially surprising. Bernie Sanders appears to have an unusually high level of distortion! Let’s take a closer look at his data:

These graphs look nothing like the other candidates’

Bernie has a dozen or two donors who have donated over 50 times. We see thousands of extra donations from high dollar donors, with particular spikes at the $1 and $3 levels. It seems especially unusual that anyone would donate this many times, seeing as the Sanders campaign had only been active for about 6 weeks. In addition, these apparent spam donations are overwhelmingly centered around the last couple of days of the fundraising cycle. I took a closer look at a few of the most prolific donors. One person, for example, made 109 donations of $1 each, and then three more of $3 apiece on the 31st of March. Another person made around 20 $3 donations per day for the week leading up to the 31st.

I am not accusing the Sanders campaign of fudging their numbers. There are a lot of other factors at play. Bernie puts heavy emphasis on his average donation. His supporters know the value of that metric. I was surprised at the comparatively high spike of $5 donations. If I were cynically trying to drive down the average, I wouldn’t do it with $5 donations, since that is way less efficient. This could be powerful proof that Bernie’s fundraising team is excellent at convincing existing donors to keep chipping in, with a well-timed email or text. This theory is slightly contradicted by the intense spike of donations over a short period of time, but I definitely believe that accounts for some portion of it.

Another factor to consider is that Bernie’s messaging around March 31st focused on reaching 1 million donations. Not donors, but donations. This was a savvy move because it pushes existing donors to donate additional times. I can understand how the combined importance of the low donation average and the drive to 1 million donations could have had an unintended stochastic effect on some of Bernie’s supporters. There may even be something commendable about being an inspiring enough candidate that not one, but several people will go through the presumably immense effort to donate 50+ times in one day.

One final caveat to the above: due to the limitations of the data set, I only have 16% of Bernie’s donations to interpret. The Sanders campaign reported receiving over 900k donations from 450k donors, and out of all that, only 10,000 donors gave over $200 and were publicly disclosed. My sample size is only 28k donations, so no one can honestly discount a candidate’s average donation based on this analysis alone. It simply raises questions about an important metric that is playing a key role in the 2020 cycle.

In conclusion, using the average donation from a campaign to determine its grassroots appeal is an inherently flawed approach. It reminds me of a few organizers I have spoken with on field campaigns. There is a tendency to focus on larger numbers since they feel more impressive; many campaigns will brag about the number of doors they knock instead of the number of people they have canvassed. Campaigns interested in giving an honest picture of their support would do well to focus on more direct measures. These include the number of overall donors (even though some distortion of the average exists, Bernie Sanders’ 450k donors is still a gargantuan achievement), volunteer shifts filled, voters contacted, or even poll numbers. It should also be clear that ActBlue has a responsibility to limit spamlike behavior on its platform. The unprecedented ability to raise money online that ActBlue provides is critically important, but it also enables these shenanigans. It’s incredibly unlikely a fundraiser would accept 50 $1 checks in the mail, and there is no way a fundraiser at an event would accept someone splitting their donation into hundreds of parts. Platforms ought to step up and ensure they aren’t abused this way.

And the FEC really should lower the disclosure limit to $50 or something; doing analysis on a third of a campaign’s budget is a real pain.

(If you’re on a campaign and you are looking to hire someone who can interpret and visualize campaign data like this, or any other position really, check out my resume here!)