Winner winner chicken dinner

Or how to forecast seat-level winners with limited data

In December of last year, the British Election Study surveyed several thousand people across the country.

Twenty-eight of those people lived in Aldershot.

Of those 28, 15 said that they intended to vote for the Conservative party in the next general election, whenever that might be. Four said they would vote for Labour, five for the Liberal Democrats, and three for UKIP. One person said that they would vote for the Green party.

No one would ever want to forecast the result in Aldershot on the basis of this information alone. Even supposing these people were a random sample of the Aldershottian population, the margin of error on a poll of 28 people is around twenty percentage points either way.
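As a rough check on that figure, here is a minimal sketch of the textbook margin-of-error formula for a proportion from a simple random sample. The function name and the worst-case choice of p = 0.5 are assumptions for illustration, not anything taken from the BES.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    n: sample size; p: assumed true proportion (0.5 is the worst case);
    z: normal quantile (1.96 for a 95% interval).
    """
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(28)
print(f"Margin of error for n=28: +/- {moe * 100:.1f} percentage points")
```

For 28 respondents this comes out a little under 19 points, which is the "around twenty" quoted above.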

Still… we could interpret this data very gingerly on the basis of what we know about Aldershot, couldn't we?

We could make a really dumb prediction about what we would expect amongst the Aldershot 28, based on what happened last time. In 2015, just over 50% of voters in Aldershot voted Conservative. As it happens, the actual share of Conservative voters in our tiny subsample is very slightly higher than that, at 54%. This might mean that the Conservatives are doing a little better everywhere, or that they’re only doing better where, as in Aldershot, they were doing well before.
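To see how little that difference tells us, here is a small sketch of the "really dumb prediction" as a binomial calculation. It assumes the 2015 Conservative share is exactly 0.50 (the text only says "just over 50%") and treats the 28 respondents as independent draws; both are simplifications.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_respondents = 28
share_2015 = 0.50   # assumption: "just over 50%" rounded to 0.50
observed_con = 15   # Conservative intenders in the Aldershot subsample

# Expected number of Conservative intenders if nothing has changed since 2015.
expected_con = share_2015 * n_respondents

# Chance of seeing 15 or more by luck alone under the 2015 share.
p_tail = prob_at_least(observed_con, n_respondents, share_2015)

print(f"expected: {expected_con:.1f}, observed: {observed_con}")
print(f"P(15 or more out of 28 | share = 0.50) = {p_tail:.3f}")
```

Under these assumptions, a subsample with 15 or more Conservatives turns up more than 40% of the time even if nothing has changed, which is why one subsample on its own settles nothing.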

If we had lots of tiny subsamples like this, we could refine this really dumb prediction. We could start testing whether the Conservatives are doing better in some regions but not in others. No single sample on its own would be enough to say one way or the other — but patterns present across several subsamples would suffice.

Of course, we couldn’t just focus on the Conservatives — we’d have to think about all the other parties at the same time. But the task doesn’t seem impossible.

Multinomial modelling

This example hopefully gives the intuition behind some of the models that I use to predict seat-level outcomes in the general election. I've taken information from the British Election Study, and I've tried to predict the outcome in each of these subsamples, based on what I already know about each constituency. (By "the outcome", I mean the number of voters in the subsample who say they'll vote for each party.)

Because I'm simultaneously modelling counts for seven parties (eight in Scotland and Wales), the model I use (a Dirichlet-multinomial model) is necessarily somewhat complicated. Each bit of information I use to predict how the subsample will respond has a little effect on the probability of responding Conservative, a little effect on the probability of responding Labour, and so on. (And because I have separate models for England, Scotland and Wales, there are separate effects in each country).
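For readers who want to see what a Dirichlet-multinomial looks like, here is a minimal sketch of its log-probability, written with the standard library only. It is not the fitted model: the mean shares and concentration values below are hypothetical, and the parameterisation (mean shares times a single concentration parameter) is just one common way of writing the distribution.

```python
import math


def dirichlet_multinomial_logpmf(counts, mean_shares, concentration):
    """Log-probability of party counts under a Dirichlet-multinomial.

    mean_shares: expected share for each party (should sum to 1).
    concentration: larger values mean less over-dispersion; as it grows,
    the distribution approaches a plain multinomial.
    """
    alphas = [concentration * share for share in mean_shares]
    n = sum(counts)
    alpha_total = sum(alphas)
    log_p = math.lgamma(n + 1) + math.lgamma(alpha_total) - math.lgamma(n + alpha_total)
    for x, a in zip(counts, alphas):
        log_p += math.lgamma(x + a) - math.lgamma(a) - math.lgamma(x + 1)
    return log_p


# Counts from the Aldershot subsample: Con, Lab, LD, UKIP, Green.
aldershot_counts = [15, 4, 5, 3, 1]
# Hypothetical expected shares, for illustration only.
illustrative_shares = [0.50, 0.18, 0.09, 0.18, 0.05]

for concentration in (5.0, 50.0, 500.0):
    log_p = dirichlet_multinomial_logpmf(aldershot_counts, illustrative_shares, concentration)
    print(f"concentration={concentration:>5}: log-probability = {log_p:.3f}")
```

In a regression setting the mean shares would themselves depend on the constituency-level predictors, which is where the "little effect on each party" comes in.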

This means that the easiest way of presenting this model is to show what happens when one thing changes at a time. I've used the following pieces of information to predict outcomes:

  • the vote shares for the Conservatives, Labour, the Liberal Democrats, and UKIP in 2015
  • the winner of the seat in 2015
  • the result of the referendum in that constituency
  • the region

Past vote shares are so important that I've allowed the effects of these to vary according to winner, region, and referendum outcome. Again, this introduces more complexity — but it does help in predicting the outcome.

Here, for example, are the predictions I'd make for each constituency subsample within the BES data, as the Labour vote share for that constituency increases. These are predictions for a seat in the North West, where 53.5% voted to Leave the EU, and which is already held by Labour.

You'll see that Labour is predicted to do better where it did better in 2015. It would be pretty surprising if the model didn't predict this. An assumption like this is the basis for uniform national swing, a ridiculously simple (but ridiculously effective) alternative way of providing seat-level forecasts.
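For anyone unfamiliar with uniform national swing, a minimal sketch: take each party's change in national share since the last election and add it to that party's previous share in the seat. All of the numbers below are hypothetical, and real implementations need to decide what to do when a share goes negative, which this sketch ignores.

```python
def uniform_national_swing(seat_previous, national_previous, national_now):
    """Add each party's national change in share to its previous seat share."""
    return {
        party: seat_previous[party] + (national_now[party] - national_previous[party])
        for party in seat_previous
    }


# Hypothetical shares, for illustration only.
seat_2015 = {"Con": 0.30, "Lab": 0.45, "Other": 0.25}
national_2015 = {"Con": 0.38, "Lab": 0.31, "Other": 0.31}
national_now = {"Con": 0.43, "Lab": 0.26, "Other": 0.31}

forecast = uniform_national_swing(seat_2015, national_2015, national_now)
for party, share in forecast.items():
    print(f"{party}: {share:.2f}")
```

Because the national changes sum to zero, the seat shares still sum to one after the swing is applied.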

At the same time, the prediction for Labour almost always falls short of their performance in 2015 (represented by the dotted diagonal line). Labour is predicted to be ahead of its 2015 performance in a small number of seats — but these are seats it has no chance of winning.

As the prediction for Labour increases, the prediction for the other parties goes down, since the predictions have to sum to 100%.
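One way to see why the shares trade off like this is a sketch with a multinomial-logit (softmax) link: each party gets a score, and the scores are turned into shares that always sum to one. The link function and the scores below are assumptions for illustration; the post does not say which link the fitted model uses.

```python
import math


def shares_from_scores(scores):
    """Convert per-party scores into shares that sum to 1 (softmax)."""
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {party: math.exp(score - top) for party, score in scores.items()}
    total = sum(weights.values())
    return {party: weight / total for party, weight in weights.items()}


# Hypothetical scores for one seat.
baseline = {"Con": 0.2, "Lab": 0.5, "LD": -1.0, "UKIP": -0.8, "Other": -1.5}
# The same seat with a higher Labour score, e.g. from a higher 2015 Labour share.
boosted = dict(baseline, Lab=baseline["Lab"] + 0.4)

before = shares_from_scores(baseline)
after = shares_from_scores(boosted)

for party in baseline:
    print(f"{party:>5}: {before[party]:.3f} -> {after[party]:.3f}")
```

Raising only Labour's score increases Labour's share and shrinks every other party's share proportionally.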

What about the predictions not just for seats based on how Labour did in 2015, but for different types of seats? Here's the plot which shows the predictions for Labour across different regions. (Once again, these are predictions for BES subsamples, rather than predictions for the election: I'll come to that later).

The plot shows that — at least in the BES data, and in seats Labour currently holds — Labour is falling behind more in the West Midlands and London than in other regions.

These plots have shown what we'd expect to happen to the Labour vote share in Labour-held seats. But what about Conservative-held seats? It's not impossible that Labour might fall back less in Conservative-held seats than elsewhere.

In this region, there's no real evidence that Labour are doing worse in seats they currently hold (contrary to arguments set out in a previous post). Of course, this is just a plot for one region, and since the effects of previous vote vary by region, there are eight other plots that I could show, one for each region. I'm sure one of these plots vindicates my original argument…

Finally, what about the Brexit effect — are there differences in what we would predict based on how the constituency voted in the EU membership referendum?

Not really. The trend lines are identical — at least for Labour-held seats in the North West region (which as you'll remember was the example configuration we started out with).

All of these graphs are stylised representations of what we would expect to happen to the Labour vote, based on the data found in the British Election Study's tenth wave. I could go on to show equivalent graphs for all the other parties — but to save space, I'll go on to describe how these can be used as the basis for forecasts.

From predictions to forecasts

Whenever I've talked about "predictions" above, I've intended that to mean "predictions about these tiny subsamples". In order to move from predictions about tiny subsamples to predictions about vote share, I need to bring these predictions in line with national polling. That is, I need to shift these lines up or down so that the vote share won by parties in different seats matches the final outcome I forecast.

That's a little bit complicated, because different seats have different numbers of voters in them, and those people turn out to vote at different rates. But bringing these numbers into line is absolutely necessary, because the headline polling figures have changed substantially since December 2016.
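A sketch of one way to do this kind of alignment follows. It is not necessarily the procedure used for the forecast. It weights each seat by votes cast (electorate times turnout), scales each party's share by the ratio of target to current national share, renormalises within each seat, and repeats. Every number and function name here is hypothetical.

```python
def national_share(seat_shares, votes_cast):
    """Votes-weighted national share for each party."""
    total_votes = sum(votes_cast)
    parties = seat_shares[0].keys()
    return {
        p: sum(s[p] * v for s, v in zip(seat_shares, votes_cast)) / total_votes
        for p in parties
    }


def calibrate(seat_shares, votes_cast, target, iterations=200):
    """Rescale seat-level shares until the national total matches the target."""
    adjusted = [dict(s) for s in seat_shares]
    for _ in range(iterations):
        current = national_share(adjusted, votes_cast)
        factors = {p: target[p] / current[p] for p in target}
        for s in adjusted:
            scaled = {p: s[p] * factors[p] for p in s}
            seat_total = sum(scaled.values())
            for p in s:
                s[p] = scaled[p] / seat_total  # shares in a seat sum to 1
    return adjusted


# Hypothetical subsample-based predictions for three seats.
seat_shares = [
    {"Con": 0.50, "Lab": 0.30, "Other": 0.20},
    {"Con": 0.25, "Lab": 0.55, "Other": 0.20},
    {"Con": 0.40, "Lab": 0.35, "Other": 0.25},
]
electorates = [70000, 60000, 75000]
turnout = [0.70, 0.55, 0.65]
votes_cast = [e * t for e, t in zip(electorates, turnout)]

target = {"Con": 0.45, "Lab": 0.30, "Other": 0.25}  # hypothetical national forecast
calibrated = calibrate(seat_shares, votes_cast, target)
print(national_share(calibrated, votes_cast))
```

Scaling multiplicatively rather than adding a constant keeps every share between zero and one, at the cost of moving strong seats more than weak ones in absolute terms.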

Right now, a whole bunch of polling people are probably having conniptions about the way I've used unweighted data and am now extrapolating to the broader population. Surprisingly, when you're trying to predict constituency opinion, getting good constituency-level covariates is more important than weighting properly through post-stratification (see figure 2 in this paper). People talk a good game about multilevel regression and post-stratification, and that's cool, but most of the action happens through the constituency covariates — things like past vote, and so on.

Having said that, I do need to fess up: I'm not using these unvarnished predictions, but rather I'm blending them with "predictions" derived from uniform national swing. This isn't entirely ad hoc: it turns out (if you're using data from December 2014 to predict the May 2015 election) that a 40:60 blend of model-based predictions and uniform national swing works better than either method does individually.
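The blend itself is just a weighted average. In the sketch below, the 40:60 ratio is read as 40% model and 60% uniform national swing, following the order in which the two are named; that reading, the function name, and the shares are all assumptions.

```python
def blend(model_shares, uns_shares, model_weight=0.4):
    """Weighted average of model-based and UNS-based shares for one seat."""
    return {
        party: model_weight * model_shares[party]
        + (1 - model_weight) * uns_shares[party]
        for party in model_shares
    }


# Hypothetical predictions for a single seat.
model_prediction = {"Con": 0.48, "Lab": 0.32, "Other": 0.20}
uns_prediction = {"Con": 0.52, "Lab": 0.30, "Other": 0.18}

blended = blend(model_prediction, uns_prediction)
for party, share in blended.items():
    print(f"{party}: {share:.3f}")
```

Since both inputs sum to one, so does the blend, and no renormalising is needed.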

What next?

I've now described the three main elements of the forecast model:

  1. Moving from final polls to outcomes
  2. Moving from today's polls to final polls
  3. Predicting seat-level outcomes and bringing them in line with vote share outcomes (this post)

Over the next week or so I'll try and illustrate some of the predictions I'm making for particular constituencies.

(Nerd note: You can find the code for the Dirichlet-multinomial on GitHub. I've used VGAM for this. I considered using multinomRob, which deals with robustness as well as over-dispersion, but it's really slow and I am impatient. I haven't included in the source the UNS add-on, where I did some additional funky stuff with compositional noise, calibrated to match departures from UNS in previous elections.)