How Analytics Can Improve Polling: A Case Study from South Carolina

Patrick Ruffini
Echelon Indicators
Feb 23, 2016

Campaign pollsters have been critical of media pollsters this election season for releasing polls that seem to assume turnout well beyond what we would ever see in a primary. The typical national media poll will interview 1,000 adults, selected at random, from which we will see samples of 300 or 400 respondents for the Republican and Democratic primaries. Even using registered voters as a base, this would equate to primary electorates of between 60 and 80 million. In 2008’s 50-state Obama-Clinton primary, turnout was 36 million, and in the 2012 Republican primary, turnout was 18 million.
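To make that arithmetic concrete, here is a minimal sketch of how a sample share scales up to an implied electorate. The base of roughly 200 million registered voters is an assumption, a round figure chosen because it reproduces the 60-to-80-million range above; the function name is hypothetical.

```python
def implied_electorate(base_population, sample_size, primary_respondents):
    """Scale the share of the sample answering a primary question
    up to the full base population."""
    return base_population * primary_respondents / sample_size


# Assumption: about 200 million registered voters nationwide (round figure,
# not stated in the article).
REGISTERED_VOTERS = 200_000_000

# 300 or 400 primary respondents out of 1,000 interviews, as described above.
low = implied_electorate(REGISTERED_VOTERS, 1000, 300)
high = implied_electorate(REGISTERED_VOTERS, 1000, 400)

# Actual turnout figures cited in the article.
ACTUAL_2008_DEM = 36_000_000
ACTUAL_2012_GOP = 18_000_000

print(f"Implied primary electorate: {low:,.0f} to {high:,.0f}")
print(f"Ratio to 2008 Democratic turnout: {low / ACTUAL_2008_DEM:.1f}x to {high / ACTUAL_2008_DEM:.1f}x")
print(f"Ratio to 2012 Republican turnout: {low / ACTUAL_2012_GOP:.1f}x to {high / ACTUAL_2012_GOP:.1f}x")
```

Under that assumed base, the implied electorates are a multiple of any primary turnout on record, which is the gap the rest of this piece tries to close.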

This means that it’s very likely that many polls are speaking to people who will never vote in the primary. And if respondents who won’t vote prefer someone different than those who will, this could skew the results.

This doesn’t make media polls bad polls. Media polls have competing objectives, one of which is to understand the attitudes and opinions of the general electorate. But calling voters at random with little information about their turnout history is a technique that leaves much to be desired in primaries, where turnout can be low and can vary dramatically from state to state. Research also shows that traditional techniques for gauging likelihood to vote, namely, asking voters a battery of questions about their interest in the election, don’t stack up quite as well as simply looking at the respondent’s vote history.

To demonstrate how a different approach to gauging voter turnout could work, Echelon Insights surveyed 935 South Carolina voters about the Republican primary on Thursday and Friday nights and we sent the results over to Huffington Post Pollster on Saturday morning. We’re happy to report that topline results matched up with the final outcome, with Donald Trump leading in our “most likely” scenario by 10.7 points and Marco Rubio edging Ted Cruz by one percentage point. (Trump won the primary by 10 points and Rubio placed second by 0.3 percent.)

Scenario-based estimates for candidate support at various turnout levels in the South Carolina Republican primary

Our goal was not simply to provide an accurate estimate of the final outcome. Using turnout scores measuring the likelihood that a registered voter in South Carolina would vote in the Republican primary, we constructed various scenarios, which you’ll see above. In one scenario, we weighted the data like a traditional poll, by demographics like age and gender. With the others, we did something different. We factored in a respondent’s likelihood to actually turn out and vote based on a turnout model we generated specific to the 2016 primary. We then reported estimates of support at various turnout levels. These ranged from an electorate that looked like 2012’s 603,000 votes all the way up to 800,000. Our “middle of the road” scenario, which we reported to the Huffington Post, assumed a turnout of 685,000. (Actual turnout was 730,000.)
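The mechanics of factoring in turnout scores can be sketched in a few lines. This is only an illustration of one common approach, in which each respondent's demographic weight is multiplied by a modeled turnout probability; it is an assumption, not a description of the exact procedure used for this survey. The field names and the toy respondents are hypothetical.

```python
from collections import defaultdict


def weighted_support(respondents, use_turnout_scores=True):
    """Return each candidate's share of the weighted sample.

    Each respondent is a dict with a 'candidate', a demographic 'weight',
    and a 'turnout_score' between 0 and 1. When use_turnout_scores is True,
    the demographic weight is multiplied by the turnout score, so unlikely
    voters count for less.
    """
    totals = defaultdict(float)
    for r in respondents:
        w = r["weight"]
        if use_turnout_scores:
            w *= r["turnout_score"]
        totals[r["candidate"]] += w
    grand_total = sum(totals.values())
    return {candidate: total / grand_total for candidate, total in totals.items()}


# Hypothetical toy respondents, not data from the survey.
sample = [
    {"candidate": "A", "weight": 1.0, "turnout_score": 0.30},
    {"candidate": "A", "weight": 1.0, "turnout_score": 0.40},
    {"candidate": "A", "weight": 1.0, "turnout_score": 0.95},
    {"candidate": "B", "weight": 1.0, "turnout_score": 0.90},
    {"candidate": "B", "weight": 1.0, "turnout_score": 0.95},
]

traditional = weighted_support(sample, use_turnout_scores=False)
turnout_adjusted = weighted_support(sample, use_turnout_scores=True)

for name, result in [("traditional", traditional), ("turnout-adjusted", turnout_adjusted)]:
    print(name, {c: round(share, 3) for c, share in sorted(result.items())})
```

In this toy sample, candidate A leans on low-propensity respondents, so A's share falls once turnout scores are applied. Producing estimates at specific turnout levels, such as 603,000 or 800,000 votes, would require an additional calibration step that is not shown here.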

How can factoring in turnout scores help? In our unweighted sample, 30% of the respondents who answered the survey had less than a 50% likelihood of voting on primary day. According to our simulations, this would be consistent with a blowout turnout well over 800,000. Testing with a universe where primary turnout was closer to 700,000, we found that low turnout voters should make up no more than 22% of the electorate. So, this is one way we weighted that scenario, while also allowing for the possibility that turnout could exceed all expectations through a different scenario.
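As a rough sketch of that adjustment, the weights below bring a sample that is 30% low-turnout voters down to a 22% target. The two-group split and the function names are simplifying assumptions, and the support figures in the example are hypothetical placeholders, not survey results.

```python
def share_calibration_weights(sample_low_share, target_low_share):
    """Weights that move the low-turnout group from its share of the
    sample to a target share of the modeled electorate."""
    low_weight = target_low_share / sample_low_share
    high_weight = (1 - target_low_share) / (1 - sample_low_share)
    return low_weight, high_weight


def blended_support(support_low, support_high, low_share):
    """Overall support when the low-turnout group is low_share of the electorate."""
    return low_share * support_low + (1 - low_share) * support_high


# 30% of the unweighted sample vs. a 22% target, as described above.
low_w, high_w = share_calibration_weights(0.30, 0.22)
print(f"low-turnout weight: {low_w:.3f}, everyone else: {high_w:.3f}")

# Hypothetical support levels for one candidate in each group.
SUPPORT_LOW, SUPPORT_HIGH = 0.40, 0.30
unweighted = blended_support(SUPPORT_LOW, SUPPORT_HIGH, 0.30)
calibrated = blended_support(SUPPORT_LOW, SUPPORT_HIGH, 0.22)
print(f"unweighted: {unweighted:.3f}, calibrated to 22%: {calibrated:.3f}")
```

Each low-turnout respondent ends up counting for a bit less than three-quarters of a respondent, and everyone else for slightly more than one. A candidate who does better with low-turnout voters loses a little ground as a result.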

Calibrating different scenarios based on the breakdown of likely and unlikely voters in the electorate allows us to assess how outcomes for different candidates might change as turnout changes, giving a more textured and nuanced view of the electoral dynamics at play. There is a clear connection between support for Donald Trump and turnout: in South Carolina, he received 36 percent support from those with less than a 50 percent chance of voting, and 26 percent from those with a 95 percent or more chance of voting. Crucially, though, he led across all turnout groups.

These differences weren’t enough to dramatically change the outcome under any plausible turnout scenario in South Carolina. In later voting states without an early state tradition and saturation media coverage driving turnout, we might expect lower turnout to lead to a modest reduction in Trump support levels, but one that’s unlikely to cost him more than 2 percent. Where this matters more is in caucus states.

By applying turnout scores in generating these scenarios, we also saw Rubio’s lead over Cruz shrink from two points to one, mostly because Cruz did slightly better overall with high propensity voters (though Rubio did better with very high propensity voters). In this case, a traditional weighting scheme that didn’t take turnout scores into account helped Rubio, while factoring in these scores tended to hurt him by about 1 percent across scenarios.

We think that merging the disciplines of data analytics and traditional opinion research gives you more accurate polling. By using turnout scores to inform both sample selection and weighting, we can achieve more reliable, scenario-based estimates of where an election actually stands, even if the unexpected happens. And because everything can be tied back to voter files and turnout scores, this approach makes more sense than one in which your pollster and your technology, targeting, and analytics staff are two completely separate teams.

We hope you’ll read more details on our South Carolina study over at the Huffington Post, and you can also read the crosstabs of our likeliest scenario. Have more questions? Please don’t hesitate to get in touch.

About Echelon Insights
We combine research, analytics, and digital intelligence to deliver holistic, actionable insights to guide strategy for political campaigns, corporations, and nonprofits. Learn more about our approach at EchelonInsights.com, or subscribe to us right here on Medium.

Patrick Ruffini
Echelon Indicators

Polling/analytics. Digital ex. Co-Founder @EchelonInsights.