Economic Anxiety and the Limits of Data Journalism

There is an ongoing battle among the liberal intelligentsia over “economic anxiety.” The basic question is whether economic factors — loss of manufacturing jobs, decline in living standards, increase in insecurity — are a valid explanation for the rise of Trump. To simplify, one side claims that economic anxiety is one reason, along with racism (and sexism, and anti-Semitism, and …), for Trump’s popularity; the other side claims that the economic argument is wrong, and the Trump phenomenon is all about racism (and sexism, and anti-Semitism, and …). For an evocative example of the former, see Chris Arnade here at Medium; this post is mainly about the latter.

This debate has reached its cultural apogee with the genre of the economic anxiety tweet, which features a racist, sexist, anti-Semitic, or otherwise reprehensible Trump supporter, accompanied by a sarcastic comment about the supporter’s “economic anxiety.” Here are some recent examples:

Why this particular debate has become so bitter has been lost to history. Probably the economic anxiety deniers think that explaining Trump in (partially) economic terms amounts to excusing or ignoring racism, while the economic anxiety believers think that the racism-only story ignores the erosion of the middle class over the past thirty years. This is why — since we’re all well-meaning liberals here — when not confined to 140 characters, the deniers take pains to say that we should help poor people, while the believers take equal pains to say that racism is bad.

The people thinking of the clever economic anxiety tweets are just doing it to annoy the other side; they know that one anecdote, or several dozen, doesn’t prove anything. But periodically there are attempts to disprove the economic anxiety hypothesis — with data! Dylan Matthews of Vox is the latest to take up the challenge, with a long, heavily documented, and very heated argument that the Trump phenomenon is about race, not economics. But it fails, for a simple reason: You just can’t prove what he wants to prove with the data we’ve got.

Matthews’s editor, unfortunately for him, gave his article this title:

It’s unfortunate because the article’s 2,000 words don’t contain a single quote by an actual Trump voter. (By contrast, two of the three articles he starts off by criticizing do quote Trump supporters.) At one point, Matthews placed a link on the suggestive text, “the statements of Trump supporters themselves” — but that link only goes to a tweet by Vox executive editor Matt Yglesias, which quotes exactly one person interviewed in a New York Times article:

But that’s not the problem. This is Vox, the home of data journalism, and I actually believe in their mission, for the most part: We should try to answer questions using data, not anecdote. The way to “listen to what they’re actually saying,” according to Vox, is to look at the data. The problem, as I’ll explain in perhaps excruciating length, is that the data aren’t very good.

“There is absolutely no evidence that Trump’s supporters, either in the primary or the general election, are disproportionately poor or working class,” Matthews launches in. His first source is none other than Nate Silver, who found that Trump voters “had a median household income of $72,000, a fair bit higher than the $62,000 median household income for non-Hispanic whites in America.” But Silver’s analysis doesn’t support the point that Matthews wants to make. Trump voters made more than the median family because Republican primary voters make more than the median family. Cruz voters’ median income was $73,000, while the figure for Kasich voters was $91,000. The population from which Trump voters were drawn was . . . Republican primary voters! So if Trump drew his support disproportionately from poor as opposed to rich Republicans, you would expect his voters’ median income to be lower than that of the median Republican primary voter — which is exactly what happened. Now, the effect is relatively small — Cruz voters were only slightly richer than Trump voters — so this is not strong evidence for the economic anxiety thesis. But nor is it good evidence against that thesis, especially because of the averaging problem, which I’ll come back to in a bit.

Matthews’s second source is a “major study” by Jonathan Rothwell of Gallup. I’m not sure what makes it major, except perhaps that Rothwell had a lot of data. Here’s how Matthews describes Rothwell’s findings:

Trump support was correlated with higher, not lower, income, both among the population as a whole and among white people. Trump supporters were less likely to be unemployed or to have dropped out of the labor force. Areas with more manufacturing, or higher exposure to imports from China, were less likely to think favorably of Trump.

That sounds convincing — until you actually read Rothwell’s paper, which says this:

Higher household income predicts a greater likelihood of Trump support overall and among whites, though not among white non-Hispanic Republicans. In other words, compared to all non-supporters or even other whites, Trump supporters earn more than non-supporters, conditional on these factors, but this is partly because Republicans, in general, earn higher incomes, and the difference is no longer significant when restricted to this group. …
On the other hand, workers in blue collar occupations (defined as production, construction, installation, maintenance, and repair, or transportation) are far more likely to support Trump, as are those with less education. … Since blue collar and less educated workers have faced greater economic distress in recent years, this provides some evidence that economic hardship and lower-socio-economic status boost Trump’s popularity.

In short, Rothwell provides evidence for both sides, and Matthews only tells half the story — this time. Actually, Matthews’s original writeup of the paper cited both sets of findings, because then he wasn’t trying to destroy the economic anxiety theory.

The other problem with the Rothwell paper, which I discuss at length here, is multicollinearity. It is true that Rothwell found that income was a positive and significant predictor of Trump support, at least in the full sample. But his “controls” included employment status, “works in blue collar occupation,” union member, highest degree, share of college graduates in the region, and median income in the region, all of which are correlated with household income. For example, if blue-collar workers make less money and are more likely to support Trump, that effect could be attached to the blue-collar variable (which it was) and not to the income variable (which it wasn’t). Multicollinearity doesn’t bias your results, but it makes them much more fragile.
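To see why correlated controls make a coefficient fragile without biasing it, here is a minimal simulation sketch. It uses synthetic data only, not Rothwell's; the predictor labels, the true coefficients of 0.5, the sample size, and the correlation of 0.95 are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_income_coef(rho, n=500, trials=500, seed=0):
    """Estimate the 'income' coefficient in many simulated samples.

    rho is the correlation between two standardized predictors, labeled
    here as 'income' and a correlated control. Both true coefficients
    are set to 0.5 (invented values, not taken from any real study).
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = np.empty(trials)
    for t in range(trials):
        # Draw the two predictors with the requested correlation.
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        # Outcome depends equally on both predictors, plus unit-variance noise.
        y = 0.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
        # Ordinary least squares with an intercept column.
        design = np.column_stack([np.ones(n), X])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        estimates[t] = beta[1]  # coefficient on the 'income' predictor
    return estimates

low = simulate_income_coef(rho=0.0)
high = simulate_income_coef(rho=0.95)
print(f"rho=0.00: mean={low.mean():.3f}, sd={low.std():.3f}")
print(f"rho=0.95: mean={high.mean():.3f}, sd={high.std():.3f}")
```

In both cases the estimates should center near the true value of 0.5, which is the "no bias" part. With two predictors correlated at 0.95, though, standard theory puts the spread of the estimate at roughly three times what it is with uncorrelated predictors, so any single regression can land far from the truth, and on either side of a significance threshold.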

The general problem with arguments of the form Trump-supporters-are-actually-rich is this: compared to what? If you want to answer the question of how well Trump is doing with working-class voters, you need a baseline. You can’t expect him to poll evenly with Clinton among any group of poor people: as Matthews acknowledges, “Lower-income whites are always likelier to support Democrats than other whites.” So saying that Trump supporters are richer than Clinton supporters, or that some group of poor people favors Clinton, doesn’t prove much. And as we’ve seen, Trump voters are not rich compared to other Republican primary voters.

The most obvious baseline, although it has its problems, is Mitt Romney’s vote shares in 2012. According to exit polls of the actual election, Romney lost to Obama by about 2 percentage points overall (he actually lost by more than 3 points). Among families making less than $50,000, however, Romney lost by 22 points, so in that group he under-performed his overall average by 20 points.

The first recent Clinton-Trump poll that I could find with crosstabs was by Morning Consult from last week. In that poll, Trump loses to Clinton by 4 points (see Table v16g5) in a two-way race (for comparability with 2012). Among households making less than $50,000, he loses by 7 points. So Trump does 3 points worse among poor families than he does overall, while Romney did 20 points worse.

Seventeen percentage points is a big difference.

(I think this is what journalists call burying the lede.)

It’s about as strong proof that Trump’s supporters are “disproportionately poor” as you could find. Also note that if Clinton is beating Trump by only 7 points among poor people, including African-Americans and Latinos, she could very well be losing among poor whites.
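The 17-point figure is just a difference of differences, and writing it out makes the baseline logic explicit. A minimal sketch, using only the margins quoted above (the function and variable names are mine, for illustration):

```python
def relative_margin(group_margin, overall_margin):
    """Points by which a candidate does better (+) or worse (-) in a group than overall."""
    return group_margin - overall_margin

# Margins in percentage points; negative means the Republican trails.
# Figures as quoted in the text: 2012 exit polls and the Morning Consult poll.
romney_overall, romney_under_50k = -2, -22
trump_overall, trump_under_50k = -4, -7

romney_gap = relative_margin(romney_under_50k, romney_overall)
trump_gap = relative_margin(trump_under_50k, trump_overall)

# How much better Trump does among low-income households than the Romney baseline predicts.
swing = trump_gap - romney_gap

print(romney_gap, trump_gap, swing)  # -20 -3 17
```

Comparing each candidate to his own overall margin, rather than to his opponent, is what removes the general Republican disadvantage among low-income households from the comparison.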

Eager to get to the primaries, where his data are more interesting, Matthews presents his theory of the general election:

The story is pretty simple: What’s driving support for Trump is that he is the Republican nominee, a little fewer than half of voters always vote for Republicans, and Trump is getting most of those voters.

But we can actually learn more from the data than this. Sure, Romney and Trump each get 40-something percent of the vote. But there’s a big difference in the composition of that 40-something percent. Compared to Romney, Trump does much better among poor people and much worse among families making more than $50,000.

That seems like pretty strong evidence for the economic anxiety hypothesis.

But I wouldn’t get too excited about that, either.

The deeper flaw with all of this entrail-reading, er, crosstab-reading is what I referred to earlier as the averaging problem. Donald Trump is going to get tens of millions of votes on November 8. Some of those voters will be racists. Some will be poor people concerned about their economic future. Some will be poor racists concerned about their economic future. When we look at aggregate statistics about those voters, we can get a sense for the average preferences of Trump voters, but that average mixes together a wide range of motivations.
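A toy illustration of the averaging problem, with entirely invented numbers: two hypothetical groups of 100 Trump voters that look identical in an income crosstab but differ sharply in why they support him. The scenarios, labels, and counts below are assumptions made up for this sketch, not estimates of anything.

```python
from collections import Counter

# Each voter is an (income_group, primary_motive) pair. All counts are invented.
scenario_a = (
    [("under_50k", "economy")] * 40
    + [("under_50k", "race")] * 10
    + [("over_50k", "race")] * 50
)
scenario_b = (
    [("under_50k", "race")] * 50
    + [("over_50k", "economy")] * 10
    + [("over_50k", "race")] * 40
)

def income_crosstab(voters):
    """What a poll crosstab reports: voter counts by income group."""
    return Counter(income for income, _ in voters)

def motive_mix(voters):
    """What we actually want to know: voter counts by motive."""
    return Counter(motive for _, motive in voters)

print(income_crosstab(scenario_a) == income_crosstab(scenario_b))  # True
print(motive_mix(scenario_a), motive_mix(scenario_b))
```

The crosstabs match exactly, yet economics is the primary motive for 40 voters in the first scenario and only 10 in the second. An aggregate by income alone cannot tell the two apart.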

I don’t mean to throw up my hands like a statistical nihilist and say we have to go back to interviewing people on the street. There are plenty of problems that statistics can solve reasonably well. Indeed, if we could go back in time four years and substitute this year’s Donald Trump for that year’s Mitt Romney, and run Trump against the 2012 Barack Obama, we could learn a lot about the differences between Romney supporters and Trump supporters.

But the problem with presidential elections is the same one that exists in macroeconomics: lots of things change over the same timeframe. As I said above, Trump is doing 17 points better among low-income families than you would expect based solely on Romney’s performance four years ago. On its face, that seems like strong evidence for the economic anxiety believers. But remember, Romney was a patrician who made his fortune in private equity, whom the Obama campaign successfully demonized as a ruthless job-killer, and a Mormon to boot. Almost any Republican would run better among the poor than Romney. And Obama was a child of a single mother who became a community organizer, while Clinton is the epitome of the moneyed Democratic establishment. How much of that 17 points is due to the fact that Clinton isn’t Obama, and that any generic Republican isn’t Romney? We just don’t know.

I think I could conclude here, but I want to talk about the primaries for a moment, because they illustrate another problem with this whole endeavor.

Matthews’s conclusion about the primaries (I’ll skip the sources for now, but he draws heavily on Lee Drutman and others) is this:

There is a segment of the Republican Party that is opposed to racial equality. It has increased in numbers in reaction to the election of a black president. The result was that an anti–racial equality candidate won the Republican nomination.

I’m skipping the sources because I see no reason to argue with this. Trump was the most overtly racist of the Republican candidates, and a major reason for his success was the rise in racist sentiment among the party base. This is true, and it’s bad, and it’s worrying.

But that isn’t evidence against the economic anxiety theory. It’s eminently plausible that economic dislocation makes people more receptive to racism. In fact, that’s part of the conventional historical narrative about the rise of Hitler and the Nazis, although I’m sure there are alternate theories. (It’s been more than twenty years since I studied this in graduate school, so I looked it up, and the Encyclopædia Britannica agrees.) And since the same group of people — lower-income, less-educated whites — is correlated with both poor economic outcomes and racist sentiments, it’s pretty near impossible to say from poll data whether Trump support is being driven by one and not the other.

Saying Trump is riding a wave of racism, as Matthews does, is all we need to know from an immediate practical perspective: we know we have to stop him. But it doesn’t answer the question of why racism is so popular. Racism isn’t a virus that falls out of the sky. It’s the product of historical contexts. I can’t prove that today’s heightened racism results from the Great Recession, although it seems perfectly plausible to me. But by the same token, saying “It’s racism!” doesn’t preclude the role of economic factors in making that racism attractive.

Matthews concludes his article with a moralizing critique of journalists who don’t want to call poor people racists and therefore cling to their economic anxiety narrative in the face of the evidence. Well. I think I’ve shown pretty convincingly that, insofar as the poll data say anything, they don’t say what Matthews thinks they say. (Seventeen percentage points!) But I’m not going to claim that this is incontrovertible proof of the economic anxiety hypothesis.

I think the lesson of all of this is that data journalism is a great idea, but it only works when the data are good enough, and when the journalist knows the limits of the data. But what I really want you to take away is this: Those economic anxiety tweets? It’s just a bad joke.

James Kwak is the author of Economism: Bad Economics and the Rise of Inequality, available on January 10. He is a professor of law at the University of Connecticut, the vice chair of the Southern Center for Human Rights, and a co-author of 13 Bankers and White House Burning. He previously worked at McKinsey, Ariba, and Guidewire Software.