Part 1: Looking at the Emerson Polling Data

Justin Herman
Apr 25, 2019

Recent reports from Daily Kos and Forbes use polling data from April’s Emerson poll to claim that Bernie Sanders supporters are choosing Donald Trump over other Democratic candidates. In particular, these articles focus on Sanders voters choosing Trump over Elizabeth Warren (26% of Sanders voters prefer Trump to Warren).

The Emerson poll first identifies each voter’s first choice in the primary. It then asks whom they would vote for in a head-to-head matchup between Trump and each Democratic candidate. In terms of ideology, Warren and Sanders are very similar to each other. Is 26% of Sanders voters voting against Warren indicative of sexism, or some sort of “Bernie Bro” phenomenon? Let’s explore the data.

Is the Data Trustworthy?

My first issue with the polling data is its completeness. All 356 potential Democratic party voters answered every single question.

  • I have not yet looked through Emerson’s methodology.
  • Presumably, the goal of the survey is to get national polling averages.
  • Given this goal, incomplete surveys, where respondents pick their preferred candidate but don’t answer peripheral questions, still have significant value.
  • It appears Emerson has discarded all incomplete survey observations. This means that if a respondent missed even a single question, they were dropped from the data. It is also possible that Emerson used some sort of imputation technique to deal with missing data.
  • Assuming no data imputation or manipulation, accepting only complete survey observations adds bias to the survey data. Most people reading this esoteric analysis would not sit through a complete survey, let alone the average voter. Voters who give this much attention to all of these questions are not representative of average voters. Depending on how many surveys were discarded, our dataset could have significant bias.
  • However, let’s proceed with the analysis assuming the data is clean.
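To make the complete-case concern concrete, here is a minimal sketch with made-up numbers (none of them come from the Emerson data). It assumes two hypothetical groups of respondents that differ both in how often they finish the survey and in how often they give a particular answer, and it compares the share in the full sample with the share among completed surveys only.

```python
# Toy illustration of complete-case bias. All numbers are hypothetical
# and are NOT taken from the Emerson poll.
# Each group: (label, respondents, share giving answer A, completion rate).
groups = [
    ("highly engaged", 200, 0.30, 0.90),
    ("less engaged", 800, 0.15, 0.30),
]


def share_of_a(groups, completers_only):
    """Share giving answer A, optionally restricted to completed surveys.

    Assumes completion is unrelated to the answer within each group.
    """
    total = 0.0
    total_a = 0.0
    for _label, n, share_a, completion in groups:
        kept = n * completion if completers_only else n
        total += kept
        total_a += kept * share_a
    return total_a / total


full_sample = share_of_a(groups, completers_only=False)
complete_cases = share_of_a(groups, completers_only=True)

print(f"Full sample:    {full_sample:.1%}")
print(f"Complete cases: {complete_cases:.1%}")
```

In this toy setup the completed surveys overstate answer A (about 21% versus 18%), because the group that finishes surveys more often also gives that answer more often. Whether anything similar happened in the Emerson data depends on how many surveys were discarded, which is not known here.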

Table demonstrating completeness of Data
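A completeness table like this one can be produced by counting unanswered questions per column. The sketch below assumes the responses are loaded as one dictionary per respondent; the rows and question names are hypothetical stand-ins, not the actual Emerson export.

```python
# Count missing answers per question. The rows and question names below
# are hypothetical stand-ins, not the actual Emerson export.
responses = [
    {"primary_choice": "Sanders", "trump_vs_warren": "Warren", "trump_vs_biden": "Biden"},
    {"primary_choice": "Biden", "trump_vs_warren": None, "trump_vs_biden": "Biden"},
    {"primary_choice": "Warren", "trump_vs_warren": "Warren", "trump_vs_biden": ""},
]


def is_missing(value):
    """Treat None and blank strings as unanswered."""
    return value is None or str(value).strip() == ""


questions = list(responses[0].keys())

# Number of unanswered rows for each question.
missing_per_question = {
    q: sum(is_missing(row.get(q)) for row in responses) for q in questions
}

# Number of respondents who answered every question.
complete_rows = sum(
    all(not is_missing(row.get(q)) for q in questions) for row in responses
)

for question, missing in missing_per_question.items():
    print(f"{question}: {missing} missing of {len(responses)}")
print(f"Fully complete rows: {complete_rows} of {len(responses)}")
```

For the Emerson data described above, a check of this kind would show zero missing answers for every question and 356 fully complete rows.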

Get Distributions of Votes for Trump Against Democratic Opponents

Rows are respondents grouped by the candidate they chose as their first choice in the primary. Columns show how those respondents would vote in each head-to-head matchup against Trump.
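The row-by-column layout described here is a cross-tabulation normalised within each row. As a rough sketch of the idea, assuming one record per respondent for a single matchup (the records below are illustrative values, not Emerson data):

```python
from collections import Counter, defaultdict

# Hypothetical rows: (primary choice, answer in the "Trump vs Warren" matchup).
# These are illustrative values, not Emerson data.
rows = [
    ("Sanders", "Warren"),
    ("Sanders", "Trump"),
    ("Sanders", "Warren"),
    ("Sanders", "Warren"),
    ("Warren", "Warren"),
    ("Warren", "Warren"),
    ("Biden", "Trump"),
    ("Biden", "Warren"),
]

# counts[primary_choice][matchup_answer] -> number of respondents
counts = defaultdict(Counter)
for primary_choice, matchup_answer in rows:
    counts[primary_choice][matchup_answer] += 1

# Row-normalise: share of each candidate's supporters picking Trump.
trump_share = {
    primary: answers["Trump"] / sum(answers.values())
    for primary, answers in counts.items()
}

for primary, share in sorted(trump_share.items()):
    print(f"{primary}: {share:.1%} would vote for Trump")
```

Repeating this for every matchup gives one column per Democratic opponent, with each cell read as "share of this row's supporters who pick Trump in that matchup".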

Weird Behavior in Data

Democratic party voters are choosing Trump over their preferred Democratic candidate?

  • 10.3% of Biden voters would vote for Trump against Biden
  • 5% of Bernie voters would vote for Trump against Bernie
  • 9.3% of Beto voters would vote Trump against Beto
  • 2.2% of Kamala voters would vote for Trump against Kamala
  • 11.9% of Warren voters would vote for Trump against Warren

This is really odd. Some Biden, Warren, and Beto voters who name those candidates as their first choice would still choose Trump over them. I have not looked at other polls, so this behavior could be within norms, but it’s strange nonetheless and should be highlighted:

  • Overall, respondents chose Trump in 273 one-on-one matchups, out of a total of 1780 matchups, which comes out to 15.3%.
  • Bernie supporters chose Trump in 84 one-on-one matchups, out of a total of 520 matchups, which comes out to 16.1%.
  • Biden supporters chose Trump in 54 one-on-one matchups, out of a total of 425 matchups, which comes out to 12.7%.
  • Warren supporters chose Trump in 9 one-on-one matchups, out of a total of 120 matchups, which comes out to 7.5%. This is odd, given that 12% of Warren voters said they would vote for Trump over Warren herself.
  • Beto supporters chose Trump in 14 one-on-one matchups, out of a total of 145 matchups, which comes out to 9.6%.
  • Kamala supporters chose Trump in 5 one-on-one matchups, out of a total of 140 matchups, which comes out to 3.7%.
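The percentages in this list are simple ratios of Trump choices to total matchups. A short sketch that recomputes them from the counts quoted above (the dictionary layout and variable names are my own, chosen for illustration):

```python
# (Trump choices, total matchups) per group, as quoted in the list above.
matchups = {
    "All respondents": (273, 1780),
    "Bernie": (84, 520),
    "Biden": (54, 425),
    "Warren": (9, 120),
    "Beto": (14, 145),
    "Kamala": (5, 140),
}

# Percentage of matchups in which each group chose Trump, to one decimal.
trump_pct = {
    group: round(100 * trump / total, 1)
    for group, (trump, total) in matchups.items()
}

for group, pct in trump_pct.items():
    print(f"{group}: {pct}%")
```

Rounded to one decimal, a few of these differ in the last digit from the figures quoted above (for example 16.2% rather than 16.1% for Bernie, and 9.7% rather than 9.6% for Beto), which does not change the overall picture.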

What Does This Mean?

Looking at this data, Bernie supporters are only slightly more likely to vote for Trump (16.1%) than the average of the dataset (15.3%). However, Bernie supporters make up 27% of the dataset, so they significantly shape that average themselves. Emerson has released two prior polls, in March and February. That data still needs to be cleaned and explored. I am hopeful that doing so will give us a bigger picture of how sample size and other factors played into these distributions and outcomes.
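Because Bernie supporters are themselves part of the overall average, one way to see how much they shape it is to compare them against everyone else instead. A minimal sketch using only the counts quoted earlier in this article (the variable names are illustrative):

```python
# Counts quoted earlier in the article: (Trump choices, total matchups).
overall = (273, 1780)
bernie = (84, 520)

# Remove Bernie supporters from the overall totals.
others_trump = overall[0] - bernie[0]
others_total = overall[1] - bernie[1]

bernie_rate = bernie[0] / bernie[1]
others_rate = others_trump / others_total

print(f"Bernie supporters: {bernie_rate:.1%}")
print(f"Everyone else:     {others_rate:.1%}")
```

With these counts, everyone else chose Trump in 189 of 1260 matchups, or 15.0%. The gap to Bernie supporters is therefore a little over one percentage point, slightly wider than the comparison against the overall average suggests.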

To see the code for the above article, please visit my GitHub repository.

To see my other projects, please visit DataScience projects. If you would like to contact me, please reach out to me on LinkedIn.
