The Problem of “Likely” Voters

Farrah Bostic
First Person Projects
May 29, 2019

At a recent gathering of AAPOR (the American Association for Public Opinion Research), researchers from Harvard, Tufts, and UMass Amherst presented their paper, published in January, titled “The Elusive Likely Voter: Improving Electoral Predictions with More Informed Vote Propensity Models” (link).

You’re probably familiar with the phrase “likely voters”. Most political polls present their numbers as the current sentiment “among likely voters”. But who are they? For that matter, who are unlikely voters? How should we understand what makes someone a likely voter or a non-voter?

It turns out that this is a hard statistical modeling problem, because some of the people you might picture as likely or unlikely voters don’t always behave the way you’d expect. People are neither especially accurate in predicting their own behavior nor in reporting that behavior retrospectively. When models are built on self-reports and self-predictions, what’s a pollster to do?

Why Does This Matter?

Accurate voter models matter a lot to campaigns, political pollsters, and the political journalists covering campaigns. A clear enough picture of who likely voters are, and what their preferences are, lets all three focus their resources: better sample design makes polling more valuable, and pollsters’ predictions often drive campaign spending and news coverage.

As we’ve written about elsewhere, the Culture of Prediction dominates our political discourse and our political organizing. We think prediction often obscures understanding, and that elections are only one metric for understanding the broader electorate. There is a lot of tea-leaf reading of tracking polls during an election cycle, not unlike business news outlets attributing a dip in the markets to a single newsworthy factor when there is likely more complexity to investors’ decision-making. We think there is a lot of complexity in how voters make their decisions, too. And while we think that instruments tooled for prediction are inadequate to the task of unraveling that complexity, we do think the assessment of likelihood to vote could be an important indicator, not just of how an election will turn out, but of how engaged in civic life the electorate is at any given moment.

What Did We Learn?

Here are a few things we think you should know about likely voters based on the paper’s findings.

  1. There is no standard definition (yet) of likely voters that all pollsters use. There are recommended batteries of six or seven questions that ANES (link), Perry-Gallup (link), and others use to gauge likelihood to vote. When you hear pollsters say “among likely voters”, it’s safe to assume they may be using any of those methods, or some other “proprietary” one. Depending on how inclusive or restrictive a pollster’s definition of the likely voter sample is, the error can range from under-reporting turnout by less than 2% to over-reporting it by as much as 5%. This lack of standardization conflates measurement error with sampling error, and probably explains some of the differences between poll results.
  2. All models are inferences about an electorate that doesn’t “exist” yet. Yes, the people exist, but the pool of people who actually vote on Election Day has not yet formed; we don’t know who will be in it, with anything approaching certainty, until after the fact. So all likely voter samples are measuring “potential” electorates. Polls are records of the bets we’re all placing on what our behavior as voters (or non-voters) will be, and on what our preferences will be, come Election Day. They are inherently probabilistic, not deterministic. So sample design for political polling probably ought to be probabilistic too, which is one of the arguments these political scientists make in their paper.
  3. Voters and non-voters are nearly identical, demographically and in terms of political preferences. But frequent voters are different, demographically and politically, from infrequent voters. To us, this suggests much more deep work should be done to understand infrequent voters, because they can introduce a lot of uncertainty into any model, particularly in higher-turnout years, when infrequent voters show up in greater numbers.
  4. Voters and non-voters are both bad at identifying themselves in advance: some who say they won’t vote do (nine percent of them, enough to be the margin in a local election), and some who say they will vote don’t. In fact, there appears to be nearly a 1 in 4 rate of churn from election to election, with voters becoming non-voters, and vice versa, at similar rates.
  5. Those who overreport retrospectively — that is, they say they voted, but did not — appear to be those most under social pressure to vote. In particular, the paper cited another study that identified “well-educated, high-income partisans who are engaged in public affairs, attend church regularly, and have lived in the community for a while” as especially likely to over-report their voting. This is of particular interest to us, as we have been looking at other indicators of civic participation, what drives that behavior, and how it relates to voting. We might describe this group as highly civically engaged — and yet they are not accurate reporters of their own past behavior. We wonder what the pressure to vote is like for those who experience it, and what drives resistance to that pressure.
  6. Despite all this inaccuracy in predicting future behavior and reporting past behavior, “intent to vote” is still by far the best predictor of actual behavior. It is especially reliable on the negative side: 90% of those who say they don’t intend to vote, in fact, do not.
  7. To get more accurate likely voter samples, the paper argues that demographics, especially race and age, should be incorporated into the model. Demographics on their own do not appear to be highly predictive, but taken together with stated intent, they can contribute to highly accurate turnout and partisan predictions (see the sketch below for what such a model might look like).
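
To make the contrast concrete, here is a minimal sketch of the two approaches: a deterministic likely-voter screen, and a probabilistic vote-propensity model that blends stated intent with a demographic term. This is our illustration, not the paper’s code; the battery items, the cutoff, the coefficients, and the example respondents are all invented for the purpose.

```python
# A minimal sketch contrasting a deterministic likely-voter screen with a
# probabilistic vote-propensity model. Every item, cutoff, coefficient,
# and respondent below is an illustrative assumption, not a value from
# the paper.

from dataclasses import dataclass
from math import exp

@dataclass
class Respondent:
    intends_to_vote: bool      # prospective self-report
    voted_last_election: bool  # retrospective self-report (often over-reported)
    interest_in_campaign: int  # 0 = none, 1 = some, 2 = a lot
    knows_polling_place: bool
    age: int

def screen_score(r: Respondent) -> int:
    """Sum a small battery of items, Perry-Gallup style."""
    return (int(r.intends_to_vote) + int(r.voted_last_election)
            + r.interest_in_campaign + int(r.knows_polling_place))

def is_likely_voter(r: Respondent, cutoff: int = 4) -> bool:
    """Deterministic screen: keep everyone at or above the cutoff, discard
    everyone below it. Raising or lowering the cutoff changes who is in
    the sample, which is one source of the under- and over-reporting
    errors the paper describes."""
    return screen_score(r) >= cutoff

def vote_propensity(r: Respondent) -> float:
    """Probabilistic alternative: a logistic model of turnout probability.
    Stated intent carries most of the weight (it is the single best
    predictor), and a demographic term (age, as a stand-in) adjusts it.
    These coefficients are made up for illustration, not fitted."""
    z = (-2.0
         + 3.0 * int(r.intends_to_vote)
         + 0.8 * int(r.voted_last_election)
         + 0.5 * r.interest_in_campaign
         + 0.02 * (r.age - 18))
    return 1.0 / (1.0 + exp(-z))

# Rather than keeping or discarding respondents outright, a probabilistic
# design weights each respondent's answers by their turnout probability.
sample = [
    Respondent(True, True, 2, True, 64),    # a textbook "likely voter"
    Respondent(True, False, 1, False, 23),  # intends to vote, thin history
    Respondent(False, True, 0, True, 47),   # says they won't vote, but might
]
for r in sample:
    print(f"screened in: {is_likely_voter(r)}  "
          f"propensity weight: {vote_propensity(r):.2f}")
```

Note what happens to the second respondent: the cutoff screen drops them from the sample entirely, while the propensity model keeps their answers in at roughly 80% weight. That difference in how uncertain voters are handled is, as we read it, the core of the argument for probabilistic sample design.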

What Questions Does This Paper Raise?

Here are some questions that this paper raised for us as we think about the greater context of civic engagement:

  1. How might we better understand “intent”? What do we know about the people who say they’re not going to vote and then do? What do we really know about the people who say they are going to vote and then don’t? It’s easy to default to a moral evaluation of these two groups, that one group in the end stepped up and performed their duty while the other was lazy or disengaged — but what do we really know about what drove their final decision?
  2. How might we better understand “uncertain voters”? A lot is made of the “undecided” voter, which is commonly understood to still be a likely voter who simply hasn’t made up their mind about who they will vote for. But what about the literally undecided voter — the person who hasn’t even decided if they’ll vote? How should we understand them? And how should the culture of prediction best measure and model them?
  3. How might “intent to vote” serve as an indicator of civic health? Most importantly, we’re intrigued by the possibility that the intent-to-vote questions, rather than serving as a screener or a weighting model for interpreting the sample and its responses to partisan questions, could be an interesting rolling indicator of civic engagement across an election cycle. They might give us a baseline reading of how people feel about the election (or elections) overall. The follow-up questions, then, revolve around what is influencing those intentions.

We think these questions are probably best investigated by civil society organizations and local governments who should be working to engage their constituents as much as possible. Campaigns and pollsters rely on stable predictive models — indeed, this is why most campaigns focus on turning out “their” voters rather than all voters. Net new voters create uncertainty, because infrequent or new voters haven’t been modeled, and we know that they are different from frequent and established voters.

But we think that uncertainty already exists and shouldn’t be ignored, especially in highly partisan, closely contested elections decided by narrow margins. So we need to do the work of understanding what really drives people to vote in a given election, and what deters them. As we begin to develop a set of indicators of civic health, we think this question of intent may be a starting point for deeper investigation of the building blocks of civic engagement, and of how they are influenced by the context of any given political contest.

While we have seen some evidence that voters tend to align as either “duty-bound” voters or “transactional” voters, we wonder how closely held those alignments are, and how much the facts and experience of a political system or election can cause people to modulate between viewpoints based on context. Is this modulation responsible for the high rate of churn between voters and non-voters? Does it explain anything else about how people engage in civil society beyond voting? We’ll keep you posted as we learn more.
