Estimating the Presidential winner through the Electoral College

Mark Huber
5 min read · Sep 29, 2020

Even knowing the probabilities for each state leaves open important modeling decisions.

The United States elects its President through the Electoral College, which consists of electors from the 50 states and the District of Columbia. Each state is awarded a number of electors equal to its number of Representatives (which is based on population) plus its number of Senators (which is always two). The District of Columbia receives three electors. Altogether, that gives 538 electors.

Forty-eight states and the District of Columbia use a winner-take-all approach, where all of a state's electors must vote the way their state decides.

The states of Nebraska and Maine can split their electors between candidates based on the vote within the state: two electors go to the statewide winner and one goes to the winner of each congressional district. Because Nebraska is heavily favored to go Republican and Maine is heavily favored to go Democratic, this will be ignored in the analysis below for simplicity.

The outcome of each state can be treated as being determined by a flip of an unfair coin. Mathematicians call such a flip a Bernoulli random variable. A value of 1 indicates heads, which here will mean Democratic, while a 0 indicates tails and Republican. If the Bernoulli random variable for state i is called B_i, and the state has E_i electoral votes, then the votes towards the Democrats are B_i E_i, and the votes towards the Republicans are (1 - B_i) E_i. The question is, which is bigger:

$$\sum_i B_i E_i \quad \text{or} \quad \sum_i (1 - B_i) E_i \;?$$

If the left-hand and right-hand sums equal one another at 269, the Electoral College is tied. The election is then decided by the House of Representatives, but not by a majority vote of its members. Instead, the Representatives of each state vote as a single delegation, and a majority of the state delegations determines the President.
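As a small illustration of the two sums and the tie case, here is a minimal sketch in Python. The function names `tally` and `decide` and the three-state example are hypothetical and are not real election data.

```python
def tally(outcomes, electors):
    """Return (democratic_votes, republican_votes).

    outcomes[i] is B_i (1 = Democratic, 0 = Republican) and
    electors[i] is E_i, the number of electoral votes of state i.
    """
    dem = sum(b * e for b, e in zip(outcomes, electors))
    rep = sum((1 - b) * e for b, e in zip(outcomes, electors))
    return dem, rep


def decide(dem, rep):
    """Compare the two sums; equal sums mean the Electoral College is tied."""
    if dem > rep:
        return "Democratic"
    if rep > dem:
        return "Republican"
    return "tie"


# Hypothetical three-state example (made-up elector counts).
electors = [10, 6, 4]
outcomes = [1, 0, 0]
dem, rep = tally(outcomes, electors)
print(dem, rep, decide(dem, rep))
```

With the real map the two sums always add to 538, so a tie means each side has exactly 269.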

The probability of heads for each of these Bernoulli random variables can be estimated using a prediction market. The values used in this analysis were drawn from PredictIt.com on 2020-09-23 at 16:30 PDT.

Figure: An estimate of the Electoral College probabilities on 2020-09-28 using the prediction markets at PredictIt.com

Because the prediction market works like a stock market, prices have to fall on a whole number of cents, so a quoted price only pins the probability down to within a cent. To get the probability, the value for each state was therefore increased by 0.5 percentage points. For example, Nevada was taken to have a 75.5% chance of going Democratic, while Florida was given a 45.5% chance of going Democratic and a 54.5% chance of going Republican.
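The adjustment is simple arithmetic, sketched below. The function name `market_probability` is hypothetical, and the sketch assumes that the Democratic contract price is the one being adjusted; the prices of 75 and 45 cents are the ones implied by the Nevada and Florida figures above.

```python
def market_probability(price_cents):
    """Turn a whole-cent contract price into a probability.

    Adds half a cent, as described in the text, then rescales to [0, 1].
    """
    return (price_cents + 0.5) / 100.0


nevada_dem = market_probability(75)    # 75.5% chance of going Democratic
florida_dem = market_probability(45)   # 45.5% chance of going Democratic
florida_rep = 1.0 - florida_dem        # the remaining 54.5%
print(nevada_dem, florida_dem, florida_rep)
```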

Simulation decisions. So at this point we are ready to simulate the outcome of the Electoral College, right? Well, not quite yet. You see, there is still the question of how correlated the state outcomes will be. For instance, if the Democratic candidate makes a gaffe, that could reduce the probability of winning in several states simultaneously. Similarly, if the Republican candidate has a scandal, then that could reduce their chance of winning in multiple states at the same time.

To illustrate how much differing models of correlation can affect the outcome, consider two models for how the random variates are drawn. Each gives the same probabilities to the individual states (referred to as the marginal distributions), but the correlation between pairs of states is very different.

To flip a Bernoulli coin, the usual method is to first generate a uniform random variable. This is a real number that lies between 0 and 1. Intuitively, it can be thought of as being spread out evenly over the numbers from 0 to 1. More formally, a uniform is equally likely to land in any two nonoverlapping intervals between 0 and 1 that have the same length. For instance, a uniform has the same chance of landing between 0.1 and 0.2 as it does of landing between 0.5 and 0.6.

To go from a uniform U to a Bernoulli B with probability p of being 1, set B = 1 if U ≤ p, and B = 0 if U > p. Then B equals 1 with probability p.
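The rule can be checked empirically with a few lines of Python. This is only a sketch: the function name `bernoulli_from_uniform` is hypothetical, and Python's standard `random` module stands in for whatever generator was actually used.

```python
import random


def bernoulli_from_uniform(u, p):
    """Return 1 if u <= p and 0 otherwise, so that P(B = 1) = p."""
    return 1 if u <= p else 0


rng = random.Random(0)   # fixed seed so the run is repeatable
p = 0.755                # Nevada's Democratic probability from the text
n = 100_000
flips = [bernoulli_from_uniform(rng.random(), p) for _ in range(n)]
print(sum(flips) / n)    # the observed fraction of 1s should be close to p
```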

Now here is where there is a choice to be made: should the same uniform be used for all the states, or should a different, independent uniform be used for each state? The first method will be called the master uniform method, and gives the largest possible correlation between any pair of states. The second method will be called the independent uniform method, and results in zero correlation between any pair of states.
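A short simulation makes the difference concrete. The sketch below uses two probabilities quoted in this article (Florida at 45.5% and Nevada at 75.5% Democratic); the function names, the seed, and the number of trials are arbitrary choices made for illustration.

```python
import random


def flip_states(probs, rng, shared):
    """Return one list of Bernoulli outcomes, one entry per state.

    shared=True  -> master uniform: a single U decides every state.
    shared=False -> independent uniforms: a fresh U for each state.
    """
    if shared:
        u = rng.random()
        return [1 if u <= p else 0 for p in probs]
    return [1 if rng.random() <= p else 0 for p in probs]


def estimate(probs, shared, trials=100_000, seed=1):
    """Estimate P(first state Dem) and P(first two states both Dem)."""
    rng = random.Random(seed)
    first = both = 0
    for _ in range(trials):
        b = flip_states(probs, rng, shared)
        first += b[0]
        both += b[0] * b[1]
    return first / trials, both / trials


probs = [0.455, 0.755]   # Florida and Nevada, probability of going Democratic
master = estimate(probs, shared=True)
independent = estimate(probs, shared=False)
print("master uniform:      ", master)
print("independent uniforms:", independent)
```

Both methods give Florida to the Democrats about 45.5% of the time, so the marginals agree. The joint behavior differs: with a master uniform, Nevada goes Democratic whenever Florida does, so both go Democratic about 45.5% of the time, while with independent uniforms the frequency is close to 0.455 × 0.755 ≈ 0.34.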

Call states with p between 30% and 70% swing states. The following table looks at these swing states.

Table: The probability p of voting for the Democratic candidate for each swing state, together with the state's number of electors and the total number of electors from states whose p is at most that state's p.

The final column shows how many electors belong to states whose probability of voting Democratic is at most that of the state in question. For instance, Florida has a 45.5% chance of going to the Democratic candidate. There are 240 electors from states whose probability of going Democratic is at most 45.5% (including Florida itself).

Since a candidate needs 270 of the 538 electors to win, the pivotal state is Pennsylvania. If U > 0.625, then Pennsylvania and every state with a smaller p go to the Republicans, giving them at least 286 electors and the win. On the other hand, if U is at most 0.625, then the Democrats win Pennsylvania and every state with a larger p, which gives them at least 538 - 266 = 272 electors and the presidency. Under this model, there is no way to tie.
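The same reasoning can be written as a short routine that scans the thresholds in increasing order. This is a sketch under stated assumptions: the function name is hypothetical, and because the full state table is not reproduced here, the states are lumped into three blocs. The elector counts come from the figures in the text (266 below Pennsylvania, 286 - 266 = 20 for Pennsylvania, 538 - 286 = 252 above), while the bloc probabilities 0.300 and 0.900 are placeholders that only need to sit on the correct side of 0.625.

```python
def master_uniform_outcome(states):
    """Return (P(Dem win), P(tie), P(Rep win)) under one shared uniform U.

    states is a list of (p, electors) pairs, where p is the probability
    that the state goes Democratic. A state goes Democratic when U <= p.
    """
    total = sum(e for _, e in states)
    dem_win = tie = 0.0
    prev = 0.0
    for p in sorted({p for p, _ in states}):
        # For U in (prev, p], the Democrats win exactly the states with p_i >= p.
        dem = sum(e for q, e in states if q >= p)
        width = p - prev
        if 2 * dem > total:
            dem_win += width
        elif 2 * dem == total:
            tie += width
        prev = p
    # Every other value of U (including U above the largest p) goes Republican.
    return dem_win, tie, 1.0 - dem_win - tie


blocs = [
    (0.300, 266),  # placeholder p: all states below Pennsylvania
    (0.625, 20),   # Pennsylvania
    (0.900, 252),  # placeholder p: all states above Pennsylvania
]
print(master_uniform_outcome(blocs))
```

Only the pivotal state's probability matters here, which is why lumping the other states into blocs does not change the answer: roughly 0.625 for the Democrats, 0 for a tie, and 0.375 for the Republicans.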

That is the master uniform model. But what if every state is independently decided? That is the same as flipping a coin independently for each of the states, and using the results to determine the election. This can be simulated on a computer, or solved exactly using a technique called dynamic programming.
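One way the dynamic programming calculation could go is sketched below: process the states one at a time while tracking the probability of every possible Democratic electoral total. This is an illustration rather than the exact code behind the numbers in this article, the function names are hypothetical, and the three-state input is made-up toy data.

```python
def dem_vote_distribution(states):
    """Exact distribution of the Democratic electoral total.

    states is a list of (p, electors) pairs with independent outcomes.
    Returns a list dist where dist[v] = P(Democrats get exactly v electors).
    """
    total = sum(e for _, e in states)
    dist = [1.0] + [0.0] * total          # before any state: 0 electors
    for p, e in states:
        new = [0.0] * (total + 1)
        for votes, prob in enumerate(dist):
            if prob == 0.0:
                continue
            new[votes] += prob * (1.0 - p)  # state goes Republican
            new[votes + e] += prob * p      # state goes Democratic
        dist = new
    return dist


def summarize(dist):
    """Return (P(Dem win), P(tie), P(Rep win)) from the distribution."""
    total = len(dist) - 1
    dem = sum(pr for v, pr in enumerate(dist) if 2 * v > total)
    tie = sum(pr for v, pr in enumerate(dist) if 2 * v == total)
    return dem, tie, 1.0 - dem - tie


# Hypothetical toy map: three states with 3, 2, and 1 electors.
toy = [(0.6, 3), (0.5, 2), (0.3, 1)]
print(summarize(dem_vote_distribution(toy)))
```

For this toy map the Democrats win with probability about 0.39, tie with probability about 0.27, and lose with probability about 0.34. The work grows with the number of states times the number of electors, so the full 538-elector map is a tiny computation.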

The results are as follows: under the independent model, Democrats win 78.38% of the time, Republicans win 20.76% and there is a 0.84% chance of a tie in the Electoral College.

So which is right? 62.5% or 78.38% for Democrats? 37.5% or 20.76% for Republicans? These numbers come from the two extremes of correlation between pairs of states. Reality is most likely a mixture of the two (or other) models. The first model assumes that someone in Ohio is affected by national events in exactly the same way as someone in Florida. Given the regional nature of disasters, this is perhaps too rigid. On the other hand, the idea that every state is acting independently of how the candidates behave seems wrong as well.
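One simple way to picture such a mixture, purely as an illustration: suppose that with some weight w the election behaves like the master uniform model, and otherwise like the independent model. The weight w and the function name below are assumptions made for this sketch, not estimates; the two win probabilities are the figures quoted above.

```python
def mixed_win_probability(w, p_master=0.625, p_independent=0.7838):
    """Democratic win probability when the master uniform model applies
    with probability w and the independent model applies otherwise."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    return w * p_master + (1.0 - w) * p_independent


for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"w = {w:.2f}: Democrats win with probability "
          f"{mixed_win_probability(w):.4f}")
```

Any weight between 0 and 1 gives an answer between 62.5% and 78.38%, which is the sense in which the two models bracket the estimate.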

As with most modeling situations, the answer is probably somewhere in the middle. The results of the election will give us some indication of how correlated the states actually were, information that can then be used to refine models for the next presidential cycle.

Mark Huber

Program Director of Data Science at Claremont McKenna College.