Bernie spends the most money, creative agencies are the real winners of the 2016 presidential election, and other fun discoveries from campaign expenditure data

In early 2015, I had the chance to pilot a part-time data analytics course in Chicago. A year and a half later, I had the chance to enroll in the same course in San Francisco. For ten weeks, I spent Monday and Wednesday nights after work digging into large datasets and excavating insights. I learned that Excel has more functionality than I ever imagined. That five lines of SQL can give you answers that would take half an hour to figure out in Excel. That telling a simple, compelling story with data is much harder than using either Excel or SQL.

The story I ultimately chose to tell for my final project is about the 2016 U.S. presidential election. Between Nate Silver and the New York Times and Fox News and college classmates’ Facebook posts, there is a nauseating amount of analysis for seemingly every presidential election. I picked something that I think hasn’t been under the spotlight as much: campaign expenditures.

The Hillary-Bernie rivalry has fueled heated conversations about where candidates get their campaign money from. But at least in my circles, there doesn’t seem to be much discussion about how candidates spend that money. As someone who previously had to manage a P&L like a hawk, I was curious to know: Are our presidential candidates (or rather, their campaign staff) good budget managers? How much do they spend? What do they spend it on? Is there a correlation between how much a candidate spends and their popularity in the polls?

My analysis is based on the Federal Election Commission’s 2016 presidential campaign expenditure data. Essentially, it’s a huuuge spreadsheet where every row represents an instance of spending, with information about which candidate spent it, the vendor they gave the money to, the amount of money spent, the date, the city and state the vendor is in, a description of the spending, and miscellaneous administrative data. So one row might read something like: Rubio, Marco | American Airlines | $383.60 | 11/19/15 | Fort Worth | TX | Airfare (by the way, Marco Rubio’s staff took American Airlines a lot).

When I did my analysis in July 2016, there were about 200,000 rows in this spreadsheet, spanning from June 2014 through June 2016. It’s a phenomenal dataset and I’m thrilled there’s this level of transparency into the campaigns of our possible future presidents. It’s basically like a gigantic business expense report, with every candidate’s spending included.

Except it’s also like a gigantic business expense report that’s really lazy. All datasets have limitations; I thought I’d point out upfront the ones this particular dataset has. To start, there is no consistency across the spending descriptions. So Marco Rubio might label an airfare purchase “Airfare,” but another candidate (or the same candidate on a different day) might label it something else. Ben Carson, for one, prefers “Travel.” He’s a simple guy like that. Jim Gilmore is all about “Mileage.” (Right now if you’re thinking, “Who is Jim Gilmore!!??” I don’t blame you. He was one of the many, many Republican candidates in this election cycle. He dropped out of the race in February 2016. He used to be the governor of Virginia. You’re welcome for that trivia knowledge.) Because of this inconsistency, I actually went through the expenditures and manually tagged each one under broader categories that I defined, such as “payroll” or “T&E” (travel & entertainment, AKA flights, hotels, meals, etc.).
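The tagging for this project was done by hand, but a keyword lookup could take a first pass at it. Here is a minimal sketch in Python, not the method used here: the keyword lists and the `tag_category` function are made up for illustration, and only “payroll” and “T&E” are category names that actually appear in this post.

```python
# Hypothetical keyword -> category map. Only "payroll" and "T&E" are
# categories named in this post; the keywords are illustrative guesses.
CATEGORY_KEYWORDS = {
    "T&E": ("airfare", "travel", "mileage", "hotel", "lodging", "meal"),
    "payroll": ("payroll", "salary", "wages"),
}


def tag_category(description: str) -> str:
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "untagged"  # no keyword matched, so leave it for manual review


for label in ("Airfare", "Travel", "Mileage", "Payroll Taxes", "Ice cream"):
    print(f"{label!r:16} -> {tag_category(label)}")
```

Anything that falls through to “untagged” would still need a human look, which is where most of the manual effort would likely remain.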

The other big limitation with this dataset is that it relies heavily on honest, accurate reporting from candidates, a point I’ll refer back to later on. I assume if a candidate’s staff “accidentally” categorized an expenditure as not campaign related, it wouldn’t show up in this publicly accessible spreadsheet. All this is kind of like if your company let each department report expenses however they’d like, without needing to attach receipts.

Despite these limitations, I was impressed that we the people have this level of access to what exactly our presidential candidates are doing with their campaign money. So with a laptop and a seat at a Berkeley coffeeshop, I got to work digging into this data. For the sake of this project, I focused primarily on the 17 candidates who were still in the race as of January 2016:

  • Democrats (3): Hillary Clinton, Bernie Sanders, Martin O’Malley
  • Republicans (12): Donald Trump, Ted Cruz, Marco Rubio, John Kasich, Ben Carson, Jeb Bush, Chris Christie, Rick Santorum, Rand Paul, Mike Huckabee, Carly Fiorina, Jim Gilmore
  • Third-party (2): Gary Johnson (Libertarian), Jill Stein (Green)

After about ten hours of analysis using primarily Excel and SQL, here are my biggest discoveries.


How much do candidates spend?

$840,000,000

As of June 30th, 2016, a total of $840 million has already been spent in this presidential election. To put that in perspective, that is enough money to let 28,000 people survive without pay for a full year in the U.S. That is enough money to help over 6,000 people afford a down payment on a house in the Bay Area today. Oh, and that is also roughly enough money to buy the Chicago Cubs.
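The per-person figures implied by those comparisons are easy to back out. A quick check in Python, using only the numbers quoted above (the variable names are mine):

```python
TOTAL_SPENT = 840_000_000  # total reported spending as of June 30, 2016

# Implied cost of one person getting by for a year, given 28,000 people.
per_person_year = TOTAL_SPENT / 28_000

# Implied down payment, given 6,000 buyers. The post says "over 6,000",
# so this is an upper bound on the amount per buyer.
per_down_payment = TOTAL_SPENT / 6_000

print(f"${per_person_year:,.0f} per person per year")
print(f"up to ${per_down_payment:,.0f} per down payment")
```

That works out to $30,000 per person per year and at most $140,000 per down payment.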

Clinton and Sanders have spent by far the most

Hillary Clinton has spent a total of $226.8 million in this election so far; Bernie Sanders has spent $222.6 million. Combined, they account for over half (55%) of all 17 candidates’ expenditures in this election to date. Each of them has spent about 3 times what Trump has spent, and about 7.5 times what the average Republican candidate has spent.

Total campaign spending to date

This trend holds even when I normalize for the length of each candidate’s campaign. If I divide each campaign’s total spending by how many months the campaign was running (to account for the fact that some candidates dropped out of the race early), Sanders and Clinton still top the list, with each of them spending 2–3 times what Trump has spent on a monthly basis and 5–6 times what the average Republican candidate has spent on a monthly basis. Sanders has actually BERNed (sorry) through the highest amount on average per month, which surprised me given his “one of the people” image.

Average monthly spending
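The normalization is just total spending divided by months in the race. Here is a minimal sketch using Python’s built-in sqlite3 module. The table layout, the column names, candidates “A” and “B”, and every number in it are made up for illustration; none of it is FEC data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expenditures (candidate TEXT, amount REAL);
    CREATE TABLE campaigns (candidate TEXT, months_active INTEGER);
    -- Toy rows, not real FEC data
    INSERT INTO expenditures VALUES ('A', 600.0), ('A', 400.0), ('B', 900.0);
    INSERT INTO campaigns VALUES ('A', 10), ('B', 3);
""")

# Total per candidate, then divide by how long the campaign ran.
query = """
    SELECT e.candidate,
           SUM(e.amount) AS total_spent,
           SUM(e.amount) / c.months_active AS avg_monthly
    FROM expenditures AS e
    JOIN campaigns AS c ON c.candidate = e.candidate
    GROUP BY e.candidate, c.months_active
    ORDER BY avg_monthly DESC
"""
monthly = conn.execute(query).fetchall()
for candidate, total, avg in monthly:
    print(f"{candidate}: total={total:,.0f}, per month={avg:,.0f}")
```

In this toy data, candidate A spends more in total but B spends more per month, which is the kind of reordering that normalizing by campaign length can surface.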

For being the Republican frontrunner, Donald Trump has spent relatively little: $5.4 million on an average monthly basis and $76 million total to date, which is less than even Ted Cruz has spent. Either The Donald is excellent at spending wisely, or excellent at reporting inaccurately. Or he’s been amazing at playing the media. Here’s an explanation in Trump’s own words, from his interview with Fox News in October 2015: “I’ve spent zero on advertising because you and Fox and all of the others, I won’t mention names, but every other network, I mean they cover me a lot, to put it mildly. And in covering me, it’s almost like if I put ads in on top of the program, it would be too much. It would be too much Trump.” Yup.

Comparing the frontrunners: Clinton outspends Trump in almost every category

I mentioned earlier that Clinton has spent a ton more money overall than Trump has. If we break this down into categories of spending, we see that she has outspent him in pretty much every single category except services (where she has spent half of what he has spent). The category where the two frontrunners have the biggest expenditure difference is in payroll, where Clinton has spent about 15 times what Trump has spent.

Relative spending amount: Clinton versus Trump (this chart shows how much more Clinton has spent relative to Trump in each category — for example, she’s spent 3.9 times what Trump has spent on remote marketing, and 15.1 times what Trump has spent on payroll)

Spending amount is actually kind of correlated with popularity

More work is needed to say there is a statistically significant correlation, but a first pass shows that there is a general positive relationship between amount of spending and popularity. In the chart below, the X-axis shows the average monthly spending for each candidate. The Y-axis shows the relative ranking of each candidate during the primary season (i.e. against candidates from their own party) as of early 2016. You can see that the trend is up and to the right, meaning that generally speaking, a candidate who spends more also ranks higher in the primary polls.

Relationship between spending amount and popularity

This could be for a variety of reasons. It’s possible that spending more money actually helps drive up popularity. It’s also possible, though, that candidates who are already more popular tend to have more campaign money and therefore can afford to spend more. Correlation does not equal causation.
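One way to take that first pass further is a rank correlation, which suits a Y-axis that is already a ranking. Below is a rough sketch of Spearman’s rho in plain Python. This is not the calculation behind the chart; the inputs are invented, the function names are mine, and the shortcut formula assumes there are no ties.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Invented numbers: average monthly spending ($M) and a popularity score
# where higher means more popular. Not taken from the FEC data or any poll.
monthly_spend = [0.5, 1.2, 3.0, 5.4, 9.8]
popularity = [1, 3, 2, 4, 5]

rho = spearman_rho(monthly_spend, popularity)
print(f"Spearman rho = {rho:.2f}")
```

A value near +1 means the two rankings mostly agree. With only 17 candidates, a significance test would still be needed before reading much into it, and it would say nothing about which way the causation runs.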

What do candidates spend money on?

Comparing the frontrunners: remote marketing is the biggest expenditure for both Clinton and Trump, but other priorities differ

The biggest line item for both Clinton and Trump is remote marketing: basically all non-event methods of marketing, including advertising, phone banking, fliers, etc. About half (48%) of Clinton’s total expenditures and 37% of Trump’s have been in this category. This is probably not shocking, given how important marketing is to acquiring users, ahem, voters.

How each frontrunner is distributing their expenditures across categories

Beyond remote marketing, spending priorities differ between our two frontrunners. Clinton’s second biggest line item is payroll (22.5% of her total expenditures), whereas Trump’s second biggest spending category is services (22.4%), which includes things like political consulting and accounting services.
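These percentages can be cross-checked against the totals quoted earlier. A back-of-the-envelope sketch in Python, using only figures from this post. They are all rounded, so the results are approximate, and the variable names are mine.

```python
# Figures quoted in this post, in millions of dollars (rounded).
clinton_total, trump_total = 226.8, 76.0

clinton_marketing = 0.48 * clinton_total   # 48% of Clinton's spending
trump_marketing = 0.37 * trump_total       # 37% of Trump's spending
marketing_ratio = clinton_marketing / trump_marketing

clinton_payroll = 0.225 * clinton_total    # 22.5% of Clinton's spending
trump_services = 0.224 * trump_total       # 22.4% of Trump's spending

print(f"Remote marketing: ${clinton_marketing:.1f}M vs ${trump_marketing:.1f}M "
      f"({marketing_ratio:.1f}x)")
print(f"Clinton payroll: ${clinton_payroll:.1f}M")
print(f"Trump services: ${trump_services:.1f}M")
```

The remote marketing ratio comes out to roughly 3.9, which lines up with the relative spending chart above.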

Creative agencies are the real winners of this election

Financially, creative agencies (advertising, communications, digital, marketing agencies) have benefited handsomely from this election cycle so far. Of the 10 vendors that have received the largest amount of money from all candidates, 9 are creative agencies. Six of Clinton’s top 10 vendors also fall in this category, as do four of Trump’s top 10 vendors. The biggest winner so far in this entire election cycle is Old Towne Media, a mysterious ad agency with bizarrely little online presence, which has received $83 million to date from the Bernie Sanders campaign.

The campaigns are treating private jet providers pretty well too. At $9 million, Executive Fliteways is Clinton’s 4th most expensive vendor. Tag Air has LANDED (sorry, again) the #3 spot in Trump’s list of top vendors, receiving $5 million in this election so far.

Also, Trump’s campaign really likes making custom branded apparel and headwear. Two of Trump’s top 10 vendors (Ace Specialties and Cali-Fame) fall in this category, totaling about $7 million in spending so far. Maybe those “Make America Great Again” hats really are making the custom apparel industry (financially) great again.
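The vendor rankings in this section boil down to a group-by and a sort. Here is a minimal sketch with Python’s built-in sqlite3 module. The table layout, vendor names, and amounts are all made up for illustration, and it assumes each vendor’s name is spelled the same way on every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expenditures (candidate TEXT, vendor TEXT, amount REAL)"
)
# Toy rows with made-up vendors and amounts, not real FEC data.
conn.executemany(
    "INSERT INTO expenditures VALUES (?, ?, ?)",
    [
        ("A", "Ad Agency One", 500.0),
        ("A", "Charter Jets Inc", 120.0),
        ("B", "Ad Agency One", 300.0),
        ("B", "Hat Maker LLC", 200.0),
        ("B", "Charter Jets Inc", 50.0),
    ],
)

# Sum what each vendor received across all candidates, largest first.
top_vendors = conn.execute(
    """
    SELECT vendor, SUM(amount) AS total_received
    FROM expenditures
    GROUP BY vendor
    ORDER BY total_received DESC
    LIMIT 10
    """
).fetchall()

for vendor, total in top_vendors:
    print(f"{vendor}: {total:,.0f}")
```

Adding `WHERE candidate = ?` before the `GROUP BY` would narrow this to a single campaign’s top 10.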


Who knew digging into other people’s spending could be so much fun? To wrap up:

  • Clinton and Sanders have spent the most money, by far.
  • Trump’s campaign is fantastic at either managing expenses or fudging expenses. They also really like making branded apparel.
  • Amount of spending appears to be positively correlated with popularity, though I haven’t tested for statistical significance.
  • Remote marketing is both presidential frontrunners’ most expensive category of spending.
  • This election has been especially financially rewarding for creative agencies.

A few recommendations I’d like to make based on this project:

  • We really need to standardize and tighten how candidates report their expenditures. The current process leaves too much room for inconsistency and inaccuracy.
  • A campaign manager who wants to cut costs should look into less expensive options for remote marketing, since that is currently the campaigns’ biggest line item.
  • If a creative agency (or private jet provider) wants to increase revenue, marketing to political candidates could be a good idea.

And some areas for further exploration:

  • Why is Trump spending so much money (proportionally) on services?
  • What incentives exist today for candidates to report something as a campaign expense versus a non-campaign expense? Answering this would help us understand whether and why a candidate might underreport or overreport campaign spending.
  • How does spending amount relate to campaign budget? I would imagine how much money a campaign has impacts how much money that campaign spends.
  • Why is Bernie paying people with ice cream and how common is it for political candidates to pay people “in kind” in general? I’m just imagining all those idealistic, bright-eyed interns working at campaign offices across the country, being paid in nothing but ice cream and office supplies and rent…

Gems from the dataset

Yea I know this guy isn’t in the 2016 race but you know you wish he were

Finally, since this is my very first data analytics project of this scope where I got to pick the subject, I thought I’d share a few reflections on the process:

  • Writing Excel functions and SQL queries and making charts are really not the hard part of data analytics. They’re certainly not easy either, but what’s much more challenging is approaching a complex issue in the right way (or with the right “analytical framework,” as my instructor would say) and then telling a simple yet powerful story about the results. Learning to do that better is what I’ve enjoyed the most about my class. It is very much still a work in progress, and if you’ve read through this whole post and have feedback on how I can tell this data story better, I’m all ears.
  • Excel and SQL might not be the hardest part of data analytics, but they’re pretty darn hard to remember over the long haul in the absence of continual practice. For that reason, I’m challenging myself to take on a data analytics project on a regular basis. This election expenditure project was my first foray. There will be more to come. Can’t let those newly-built Excel and SQL muscles go flabby.
  • She who analyzes the data has the power. I am blown away by how much sway the analyst has over major decisions and opinions. Most of the world’s population relies on other people to make sense of data for them. This is especially true when we’re talking about the increasingly massive datasets at our disposal, where tools like Excel and SQL and Python become crucial for sense-making and where the number of people who can do the sense-making becomes ever smaller. But as we saw earlier, datasets have flaws. They’re often not entirely accurate. They are vulnerable to individual analysts’ personal biases. Their results are dependent on what analytical framework gets chosen. All this makes it increasingly important for the general populace to have a foundational knowledge of how data works, and for all of us to be doubly critical when we are presented with data-driven results. It is simply too easy to lie with statistics.