The “13 Keys” are garbage and you should stop paying attention to Allan Lichtman

Mac Tan
9 min read · Aug 7, 2020

This post is adapted from a Quora answer to the question “What do you think of Allan Lichtman and Helmut Norpoth’s predictions for the outcome of the 2020 election?”

American University historian and quadrennial cable news talking head Allan Lichtman recently published his prediction for the upcoming presidential election, and as often happens, I got a Quora question about it.

I’ve made no secret of my disdain for Allan Lichtman, not to mention the media buzz he gets every four years. I’ve written down some of my thoughts about Lichtman himself on Quora before, but they’re kinda scattered about, so it’s probably best that I get them down in one place. The conclusion I’ve come to over the years is that Allan Lichtman is basically the Dr. Oz of election forecasting. Maybe they knew what they were talking about at one point; maybe they’re still even competent in private, when there are no cameras around. But what they put out for public consumption is garbage, peddled by snake oil salesmen who just like to see themselves on TV. Heck, they even kinda look alike.

[Image caption: Anyone ever see these two in a room together?]

I should probably explain why I think so little of Lichtman. It’s not that he has a model — for a very, very liberal definition of the word — that performed badly. That would be fine. Building good models is hard, and you shouldn’t shame people for failing to do it unless it’s their job (and it isn’t in this case). It’s that he tries to pass himself off as some imparter of secret wisdom about how the American electorate behaves — and worse still, that he’s succeeded in doing so with the majority of people who’ve heard of him.

What really ticks me off, though, is how he’s done it. His main bragging point is that his “13 Keys to the White House” — a set of 13 yes/no questions, first published in 1996, where the incumbent party wins the popular vote if the answer to eight or more of them is “yes” — would have accurately predicted the winner of the popular vote in every election since 1860. And what I hate — hate — about this argument is that it sounds impressive superficially: how can a guy who correctly calls 40 consecutive elections not know something that we don’t?
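For concreteness, the entire “model” fits in a few lines of code. This is a minimal sketch of the decision rule as just described; the function name and the representation of the keys are mine, not Lichtman’s:

```python
def keys_prediction(answers: list[bool]) -> str:
    """The Keys' decision rule as described above: the incumbent party
    is predicted to win the popular vote when 8 or more of the 13 keys
    are true (equivalently, when 5 or fewer are false)."""
    assert len(answers) == 13, "there are exactly 13 keys"
    return "incumbent party wins" if sum(answers) >= 8 else "incumbent party loses"

# Example: 7 true keys and 6 false ones -> predicted incumbent loss.
print(keys_prediction([True] * 7 + [False] * 6))
```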

Well, for one thing, it’s not true — he incorrectly predicted that Donald Trump would win the popular vote in 2016, and he also somehow contrived to predict that Al Gore would win the election in 2000 when his Keys purport only to predict the winner of the popular vote. But let me digress for a second. Let’s reduce the number of predictors to just one, which we’ll call X, and let’s use X to predict the value of Y, a numeric variable rather than a binary variable (like “does the incumbent party win?”) using the following data:

[Scatterplot of the toy X–Y data]

What would you predict the value of Y to be if X=4? Well, one thing you could do is just draw a straight line that fits as best you can to the points you have, which would look something like this:

[The same scatterplot with a fitted straight line]

in which case by going over to 4 on the horizontal axis and finding how high the line is at that point, you’d come up with a prediction of around 8 for Y.
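Here’s a quick sketch of that first model. Since the post’s scatterplot isn’t reproduced here, the data points below are made-up stand-ins; the exact numbers will differ from the figures, but the procedure is the same:

```python
import numpy as np

# Made-up stand-in data resembling a noisy upward trend (the original
# post's scatterplot data isn't available).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5, 5.0, 5.5, 6.0])
y = np.array([2.1, 6.0, 1.5, 7.8, 3.9, 9.2, 5.5, 11.0, 8.3, 12.6, 10.1])

# Ordinary least-squares fit of a straight line (a degree-1 polynomial).
slope, intercept = np.polyfit(x, y, deg=1)

# Read the prediction off the line at X = 4.
print(slope * 4 + intercept)
```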

But you know what? That line doesn’t fit that well. It misses some points by a lot. Let’s try to correct for that by allowing the line to curve so it gets closer to those points (I’ve done this below by fitting a loess smoother with a small span parameter, but you could do the same thing conceptually by adding a bunch of polynomial terms to the regression equation):

[The same scatterplot with a wiggly loess curve passing near every point]

Better, right? The curve goes through a lot more of the points now, so it’s a better-fitting model. And it has none of the bad misses that the line had. This model would predict Y=-4 when X=4.
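The wiggly model is easy to sketch too. Instead of a loess smoother, the snippet below uses a high-degree polynomial, which (as noted above) is the conceptual equivalent and keeps the example dependency-free. With stand-in data the output won’t match the post’s Y=-4, but the behavior is the same:

```python
import numpy as np

# Same stand-in data as the previous sketch.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5, 5.0, 5.5, 6.0])
y = np.array([2.1, 6.0, 1.5, 7.8, 3.9, 9.2, 5.5, 11.0, 8.3, 12.6, 10.1])

# A degree-9 polynomial has almost as many free parameters as there
# are points, so it can bend through nearly all of them.
coeffs = np.polyfit(x, y, deg=9)

print(np.polyval(coeffs, x) - y)  # tiny in-sample residuals
print(np.polyval(coeffs, 4.0))    # the curve's "prediction" at X = 4
```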

Which is the “right” prediction? Strictly speaking, I can’t say what the true value should be — this is fake data that I made up in 30 seconds, after all. What I can say is that the first model is certainly far more justifiable than the second. A straight line looks quite reasonable as a fit to the data, even if its predictions are quite noisy; there’s really no reason, given the data, to suspect that the real phenomenon follows the super-wiggly curve of the second model, randomly jumping from high to low and back again with seemingly little rhyme or reason. And in fact, I can go further and say that the first model is objectively a better one, using metrics typically used to evaluate models, like AIC or cross-validation error. In statistics, we say that the second model is overfitted to the data: it follows the data it’s seen already very closely, but at the expense of its predictions on data it hasn’t seen.
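You can check this directly with cross-validation: hold out one point at a time, fit each model to the rest, and score the prediction on the held-out point. A minimal sketch, using the same stand-in data:

```python
import numpy as np

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5, 5.0, 5.5, 6.0])
y = np.array([2.1, 6.0, 1.5, 7.8, 3.9, 9.2, 5.5, 11.0, 8.3, 12.6, 10.1])

def loocv_mse(x, y, deg):
    """Leave-one-out cross-validation: refit the degree-`deg` polynomial
    with each point held out, then score the held-out prediction."""
    errs = []
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        c = np.polyfit(x[mask], y[mask], deg=deg)
        errs.append((np.polyval(c, x[i]) - y[i]) ** 2)
    return float(np.mean(errs))

print(loocv_mse(x, y, deg=1))  # the straight line
print(loocv_mse(x, y, deg=9))  # the wiggly curve: typically far worse
```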

Why this digression on math? Because Lichtman’s 13 Keys are basically an example of the second model. He tailored the keys too closely to the elections the model had seen. When his Keys (at the time only twelve of them) were first published in a 1981 paper with Russian seismologist Vladimir Keilis-Borok, they had been determined using the 31 elections from 1860 through 1980. In the context of the graphs above, that would be sort of like fitting a line to 31 points, but allowing the line to bend 11 times. Hitting all 31 points is not at all impressive when you can do that — in fact, the Keys’ performance is even more mediocre than that, as it’s even easier to hit the points when they’re yes-or-no questions, restricted to two possible values, rather than a number that could hypothetically take on any value between -100% and 100%.
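To see just how easy yes-or-no targets are to hit in hindsight, consider a deliberately dumb simulation. This is not Lichtman’s procedure, just an illustration of how little a perfect in-sample record means with binary outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)

# 31 fake "elections": 12 random yes/no keys apiece, winners decided
# by coin flip. Pure noise, zero signal.
keys = rng.integers(0, 2, size=(31, 12))
winners = rng.integers(0, 2, size=31)

# With 2**12 = 4096 possible answer patterns and only 31 elections,
# each election almost surely gets its own unique pattern...
print(len({tuple(row) for row in keys}))  # almost always 31

# ...so a rule that's free to map patterns to outcomes "retrodicts"
# all 31 winners perfectly, despite the keys containing no information.
lookup = {tuple(k): w for k, w in zip(keys, winners)}
print(all(lookup[tuple(k)] == w for k, w in zip(keys, winners)))  # True
```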

And, like all overfitted models, its performance on data it hadn’t yet seen when it was built is far less impressive than its performance on the data used to build it. Lichtman updated his Keys in his 1996 book, expanding the number to 13; in the six elections we’ve had since, he has missed in two of them: in 2016 his Keys called the popular vote (which is what they purport to predict) for Trump, who lost it, and in 2000 he called the election for Gore, who lost it. On top of that, one of the elections he called correctly, 1996, was so lopsided that nobody, not even the Republican National Committee, ever had any serious expectation that Bob Dole would win once the campaign had begun in earnest. (Bill Clinton’s 379 electoral votes and nine-point popular vote win have not been surpassed since.)

In other words, Allan Lichtman is better than flipping a coin. But not by much.

But there’s more. If the 13 Keys were any good at all, you’d expect them to do a reasonable job of predicting not just who wins the popular vote but the margin as well. Nate Silver did an analysis of the Keys’ track record on this measure in 2011, and what he found was… lackluster.

[Chart from Silver’s analysis: popular-vote margin plotted against the incumbent party’s number of keys]

The more keys the incumbent party has, the better it does in the popular vote, so that’s a good sign. Unfortunately, it’s about the only good sign. There’s a lot of variation around the predictions a linear regression would make — the 95% confidence interval around them is roughly 32 percentage points wide. Taking the 2016 election as an example, had Lichtman elected to publicize the actual uncertainty around the popular vote prediction implied by his Keys, the most he could have said with high confidence is that the margin would land somewhere between a 12-point Clinton blowout and a 20-point Trump landslide. (No wonder he never even tries to predict the actual popular vote margin.)
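The arithmetic behind that range is straightforward. A sketch, taking the 32-point interval width from Silver’s analysis and a hypothetical point estimate of Trump +4 (my number, chosen only to match the endpoints in the example above):

```python
# A 95% interval about 32 points wide implies a standard deviation of
# roughly 32 / (2 * 1.96), or about 8.2 points of popular-vote margin.
width = 32
print(width / (2 * 1.96))  # ~8.16

# Centered on a hypothetical Trump +4 point estimate, the interval runs
# from Clinton +12 to Trump +20.
center = 4
print(center - width / 2, center + width / 2)  # -12.0 20.0
```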

Point being, Lichtman’s actual track record is very unimpressive. So why does he impress people anyway? One reason is that most people don’t understand overfitting, but another is that his “keys” actually sound reasonable. At first blush, they all make some semblance of sense:

1. Party mandate: After the midterm elections, the incumbent party holds more seats in the House of Representatives than it did after the previous midterm elections.
2. Contest: There is no serious contest for the incumbent party nomination.
3. Incumbency: The incumbent party’s candidate is the sitting president.
4. Third Party: There is no significant third party challenge.
5. Short-term economy: The economy is not in recession during the election campaign.
6. Long-term economy: Real per capita economic growth during the term equals or exceeds mean growth during the previous two terms.
7. Policy change: The incumbent administration effects major changes in national policy.
8. Social unrest: There is no sustained social unrest during the term.
9. Scandal: The incumbent administration is untainted by major scandal.
10. Foreign/military failure: The incumbent administration suffers no major failure in foreign or military affairs.
11. Foreign/military success: The incumbent administration achieves a major success in foreign or military affairs.
12. Incumbent charisma: The incumbent party’s candidate is charismatic or a national hero.
13. Challenger charisma: The challenging party’s candidate is not charismatic or a national hero.

But while they sound reasonable, some of them are flexible enough that you can adjust your interpretation of the keys to suit your own prior beliefs. In 2008, for example, Lichtman judged Barack Obama charismatic (which I don’t think anyone really disputes), turning key #13 against the incumbent Republicans, but he did not give them key #12 for John McCain. Most people would agree McCain wasn’t the most engaging speaker, but there had been pretty broad, bipartisan agreement for about 30 years by that point that he was a war hero, which should, by Lichtman’s own definition, have given the Republicans key #12 in that election.

And then there are others that are ill-defined enough that it’s really hard to decide one way or the other whether they apply. Did the Tea Party protests of 2009 and 2010 count as “sustained social unrest”? What about the protests against the Iraq War before the 2004 election? And what counts as a “significant third party challenge”? In 2016, Lichtman seemed to rate Libertarian candidate Gary Johnson as a “significant” challenger, but Johnson ended up winning just 3% of the vote.

More than that, though, the very fact that he gets to choose the Keys in the first place should raise eyebrows. Yes, these factors sound reasonable, but so do a lot of others. As Silver notes in the aforementioned analysis, if you have to choose 13 factors from, say, 25 potential candidates, you have over 5 million combinations of factors to choose from. (Even more if you’re flexible with the number of factors you choose, as Lichtman evidently is.) Some of those combinations are bound to hand you a perfect historical track record, and if you rate your method by how it does on past data, it’s not hard to pick the right ones.
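The count is easy to verify. A quick check of Silver’s arithmetic (the 25 candidate factors are his hypothetical, not an actual inventory of keys Lichtman considered):

```python
import math

# Choosing exactly 13 factors out of 25 candidates:
print(math.comb(25, 13))  # 5,200,300 -- "over 5 million"

# If the number of factors is flexible too, every nonempty subset is
# fair game:
print(2**25 - 1)  # 33,554,431
```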

I’m not saying that either Lichtman or Norpoth is wrong about this election. Currently my forecast agrees with Lichtman’s prediction that Biden will win (and, incidentally, finds Norpoth’s 91% confidence that Trump will win absolutely silly). But I don’t have to agree with someone to respect them, and likewise, just because I happen to agree with someone on something doesn’t mean I have to have any respect for them whatsoever.

