Google searches and the U.S. presidential election

What the polls may have missed

Emily A. Halford
Digital Diplomacy

Photo by Element5 Digital from Pexels

Background

Most of us here in the U.S. are waiting with bated breath for the results of next week’s enormously consequential presidential election. Virtually all of the data offering insight into the likely outcome comes from polls, which, while extremely valuable, are inherently imperfect. Selection bias arises because it is nearly impossible to draw a random sample of voters with traditional polling methods, so polls often do not represent the population they intend to capture. Polling data is also notoriously susceptible to social desirability bias: respondents have a strong motivation to give the answer perceived as most socially acceptable rather than the answer aligned with their true beliefs. Americans may report support for the Black Lives Matter movement while responding to a poll, for example, even while harboring racist views that lead them to vote for Donald Trump.

In 2016, polls generally underestimated Trump’s performance by 2 full percentage points. However, an unexpected data source — Google searches — was able to identify geographic areas where Trump would perform better than the polls predicted. While Google searches are subject to some selection bias (individuals…

Emily A. Halford

I am currently a data analyst working in psychiatric epidemiology, and I am excited about the intersection of data science and mental health. Views are my own.