Is 538’s forecast being driven by trendline assumptions?

George Berry
5 min read · Nov 6, 2016


Overview

FiveThirtyEight’s election model is an outlier compared to other polls-based models, giving Trump roughly a one-in-three chance of winning. Following up on the recent controversy between Nate Silver and HuffPo editor Ryan Grim, I re-read 538’s modeling assumptions. Today, I did some basic data analysis using a really simple poll aggregation method (the median of all polls), but incorporating a trendline adjustment similar to the one 538 uses. It turns out that, depending on how you implement the trendline adjustment, you get very different results. I’m going to show that sensitivity here.

By keeping the model simple and varying only how the trend line is computed, we can see its effect more directly. Please don’t come bug me that these predictions were wrong. Code is here.

What is trend line adjustment?

In a nutshell, the trend line adjustment works like this: say we conduct a poll in North Carolina in August. It’s now September and we want to incorporate it into our model, but the national polls have changed since the NC poll was taken. We borrow information from the national polling change between August and September and re-weight (change the outcome of) the state poll. To be concrete, say the NC poll shows Trump +1 in August, and we are making a forecast in September. Assume Trump has gained 3 points nationally since the NC poll was taken. Then we add some of that +3 (according to a weight) to the August NC poll. If our weight is 1, we add the full +3 and enter the August NC poll into our model as Trump +4.
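As a minimal sketch of that arithmetic (in R, with hypothetical names; margins here are Trump’s lead, to match the example, while the rest of the post quotes Clinton-minus-Trump margins):

```r
# Hypothetical sketch of the adjustment described above.
# poll_margin: the old state poll's margin (Trump's lead in this example)
# national_shift: how much the national trend has moved since that poll was taken
adjust_poll <- function(poll_margin, national_shift, weight = 1) {
  # Shift the old state poll by the (weighted) national movement since it was taken.
  poll_margin + weight * national_shift
}

adjust_poll(poll_margin = 1, national_shift = 3, weight = 1)
#> 4   (the August NC poll enters the model as Trump +4)
```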

This methodology works if we have an accurate estimate of the weight and an accurate estimate of the trendline. If either is off, then our results might be wrong. 538’s model estimates the trendline with LOESS regression, so I’ll do the same here.

Data

I used the Huffington Post Pollster API to get all national polls (N = 200) since January 2016, and state polls from commonly cited battlegrounds: VA, PA, CO, NH, FL, NV, OH, IA, NC, AZ, GA (N = 437). These are likely-voter polls as identified by the API. Because this is a simple analysis, I’m not worrying about pollster quality, house effects, etc., and one could undoubtedly incorporate more polls (e.g. registered-voter polls).

If I say NV +3, it means Clinton leads by 3, whereas NV -3 indicates Trump leads by 3.

Model

Our model of outcomes is simple: take the median of all polls over the whole period and call that our estimate. PEC likes medians, so that seems fine to repeat. I don’t do any time-decay because I’d really just like to do the most straightforward thing that seems reasonable.
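A sketch of that calculation, assuming a hypothetical data frame called polls with a state column and a Clinton-minus-Trump margin column, plus a hypothetical national data frame of national polls:

```r
library(dplyr)

# Median margin per state over the whole period -- that's the entire model.
state_estimates <- polls %>%
  group_by(state) %>%
  summarise(estimate = median(margin))

# National estimate: median of all national polls.
median(national$margin)
```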

At the national level, this produces Clinton +4. Compare to RCP’s four-way average of +2.2 and HuffPo’s +4.9.

Doing the same thing at the state level gives: AZ -2, CO +5, FL +2.5, GA -4, IA -1, NC +1, NH +6, NV +0.5, OH +2, PA +6, VA +7.5, WI +5. These are not terrible estimates. The worst one is NH +6, where The Upshot gives +3. Compared to The Upshot, we get WI, IA, GA correct; VA within 0.5; AZ, CO, FL, NC, PA within 1 point; NV within 1.5; OH within 2.

#LiterallyNotTerrible

National trend

Using LOESS regression requires deciding how big we should make our smoothing window. Think of this as how much information we want to incorporate on either side of the point we are considering. Lower smoothing values are more local (jumpier line), while higher ones are more global (smoother line). In R, the default parameter is 0.75.
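To make the smoothing choice concrete, here is a hedged sketch of the kind of loess() call involved, again assuming a hypothetical national data frame with a numeric days column (days since January 1) and a margin column:

```r
# Two fits of the national trendline that differ only in the span (smoothing) parameter.
fit_local  <- loess(margin ~ days, data = national, span = 0.65)
fit_smooth <- loess(margin ~ days, data = national, span = 0.85)

# Fitted trend plus standard errors, which give the confidence bands in the figure below.
grid <- data.frame(days = seq(min(national$days), max(national$days)))
pred <- predict(fit_smooth, newdata = grid, se = TRUE)
```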

Here’s a picture of the national trendline using smoothing of 0.65 (left) and 0.85 (right). The outcomes are qualitatively different for election day. In the former case, a Trump victory is within the margin of error (the dashed line marks an even race). In the latter case, Clinton’s lead appears quite safe.

Left: smoothing = 0.65. Right: smoothing = 0.85. Gray bars are 95% confidence bands computed by R’s loess() function.

We’ve established that the smoothing parameter matters for the trendline, and this kind of LOESS trendline is what 538 uses for its adjustment. If we had enough data, we could cross-validate to learn this parameter on past outcomes, but there haven’t been that many elections. I’m not sure how 538 determines the smoothing parameter (or what it is), but it’s clear that the trendline itself is sensitive to it.
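For what it’s worth, one generic way to pick the span is leave-one-out cross-validation over the national polls themselves. This is a hypothetical sketch, not what 538 does, and it optimizes fit to the polls rather than election-day accuracy, which is the quantity we actually care about:

```r
# Leave-one-out squared error of the national loess trendline for a given span,
# using the same hypothetical `national` data frame as above.
loo_error <- function(span, data) {
  errs <- sapply(seq_len(nrow(data)), function(i) {
    fit  <- loess(margin ~ days, data = data[-i, ], span = span)
    pred <- predict(fit, newdata = data[i, , drop = FALSE])
    (data$margin[i] - pred)^2
  })
  mean(errs, na.rm = TRUE)  # loess returns NA when asked to extrapolate
}

sapply(c(0.65, 0.7, 0.75, 0.8, 0.85), loo_error, data = national)
```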

So how sensitive is the model to the smoothing value?

I thought you’d never ask. Here are results for four smoothing parameters (0.7, 0.75, 0.8, 0.85), plus no smoothing. These are computed by taking the difference in the national trendline (as output by the LOESS model) between today and the date each poll was taken, adding that difference to the poll, and then taking the median of the adjusted polls within each state.
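Concretely, under the same hypothetical polls and national data frames as before (with days recording when each poll was taken), the procedure looks roughly like this, with the weight implicitly set to 1:

```r
library(dplyr)

# Adjusted state medians for one choice of the smoothing (span) parameter.
adjusted_estimates <- function(span) {
  fit <- loess(margin ~ days, data = national, span = span)
  trend_today <- predict(fit, newdata = data.frame(days = max(national$days)))
  polls %>%
    # Shift each state poll by the fitted national change since it was taken...
    mutate(adjusted = margin +
             (trend_today - predict(fit, newdata = data.frame(days = days)))) %>%
    # ...then take the within-state median of the adjusted polls.
    group_by(state) %>%
    summarise(estimate = median(adjusted, na.rm = TRUE))
}

lapply(c(0.7, 0.75, 0.8, 0.85), adjusted_estimates)
```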

Smoothing values of 0.75 and below advantage Trump, while those 0.8 and above advantage Clinton. The red dots (no smoothing) are somewhere between the extremes, but tend to favor Clinton. Qualitatively, a value between 0.7 and 0.75 lines up with the current 538 model predictions in states like FL, NC, NV, OH, which are currently just-barely-Trump. Raw values are considerably more pro-Clinton than smoothing values below 0.8.

State median outcomes by model smoothing value. Red dots are raw values without smoothing.

You might say, “you’re assuming we just add the (smoothed) national change without weighting.” You would be correct, but I’d argue this is a reasonable first pass. Since we don’t know the specific weighting procedure 538 uses, I’m doing something simple that you can easily extrapolate from.

Maybe this is why 538’s forecast is so different

Most models of the election (e.g. PEC, HuffPo) do not do the type of trendline adjusting that 538 does. Using just a simple median of all the polls, it’s pretty easy to come up with numbers that are relatively in line with the consensus that Clinton has a high probability of winning. However, once we start adjusting for trendlines with certain model parameters, we can convince ourselves that things might actually be looking up for Trump. There actually seems to be an inflection point in the smoothing parameter, indicating that we should have high confidence that our parameter is correct before making predictions based on it.

It’ll take until Tuesday to figure out which forecasts are correct. However, I’d say that I’m concerned by the seeming arbitrariness of the trendline adjustment approach.
