Machine Learning and the End of History

“History is that certainty produced at the point where the imperfections of memory meet the inadequacies of documentation.”
— Julian Barnes [1]

Recently, during my commute to work, I’ve been listening to an audiobook of Antony Beevor’s “The Second World War”. It’s an encyclopedic military history of World War II that focuses on how the war itself transpired rather than on Hitler’s rise to power, the legacy of the First World War, or other aspects of the context in which the war was fought. One of the things that has struck me about it, compared to other studies of history I’ve read, is that I’m about to finish the book and Beevor has yet to directly present an overarching thesis about the war.

Many works of historical nonfiction will have a conclusion that the author argues throughout. For example, Howard Zinn’s “A People’s History of the United States” hammers the reader with an anti-establishment view that emphasizes the consistency with which class warfare (generally speaking, those with power vs. those disenfranchised) has been waged in the United States. As another example, Hannah Arendt highlighted the “banality of evil” in “Eichmann in Jerusalem”, arguing that those who perpetrated the Holocaust were far from evil ideologues but, rather, were just bureaucrats carrying out their orders. In contrast, Beevor’s account presents what is, on the surface, a factual retelling of the events of World War II without summarizing for the reader themes that recurred throughout the conflict. He does present a handful of specific judgments about individuals or situations [2], leaving the thematic analysis of the Second World War as an exercise for the reader. [3]

This is history by sleight of hand. Any human-compiled account of a historical event (or chain of events) is, by its nature, only capturing a subset of information. Even if written as an objective collection of facts—dates, names, events, etc.—the information presented and the way in which it is laid out are products of the (human) author. No writer has access to all of the facts, and even if one did, it would be (a) nearly impossible to put them all into one book and (b) certainly impossible for a reader to derive a conclusion from that volume of information, let alone do so objectively. Historians are fallible, and their individual views and biases influence the works they produce. So, although Beevor presents his work as a series of facts without a direct thesis of his own, the facts he chooses to present—and the manner in which he presents them—make his argument for him.

There are essentially two ways history books are written. The first, and more flagrantly dishonest, is for an author to have an ideological conclusion that they then selectively retrieve facts to support. The second, less intentionally misleading (but ultimately a form of self-delusion on the author’s part), is for the author to pose a question (e.g. “why did the Axis lose WWII?”), research the answer from all available information, decide as objectively as possible what the strongest argument is from that evidence, and form a thesis around that.

This second route is deceptive on multiple levels. First, an author never has all of the facts, merely the ones for which documentation survives and is available to them. That is a subset (facts available to the author) of a subset (documented facts) of reality. “History is written by the winners” is a form of meta-selection bias. Second, humans are full of cognitive biases that will affect any historian’s conclusion. There’s confirmation bias, where an individual weighs more heavily information that confirms his or her existing viewpoint; there’s sequence bias, where even an author who enters a topic of study with no existing viewpoint becomes biased by the information presented first; and there’s selection bias (separate from the previously mentioned meta-bias), where the information an author sees is not a representative sample of the existing documentation as a whole (forget reality as a whole). These are not the only cognitive defects affecting historical accounts, but they illustrate that humans are susceptible to all kinds of influences that subtly shape their views. In the end, many historical theses come down to chance: whichever information an author happens to encounter first, and in the greatest quantity, shapes their argument.

In short: although many historians strive for objectivity, humans are fundamentally bad at it.

But what if you could ingest, all at once, all of the knowable facts about a historical event? This is one of the great promises of computers and machine learning: a computer can take a wholly rational approach to the analysis of fact sets. A computer could be the ideal historian. Although establishing causal chains is, at present, a difficult task (any lawyer worth their salt will know this: the “but-for” question), computers (and the ML algorithms they run) are getting increasingly proficient at deconstructing complex interrelationships and identifying the impact of individual inputs. Vinod Khosla wrote about this in his paper on the future of healthcare, “20-percent doctor included”:

[In] a recent tumor pathology study […] while pathologists did a good job of reading cancer tissue pathology the system learned to read the same things. But surprisingly, and completely unprompted and without any knowledge of the biology of cancer it discovered new features to look for that human researchers and thousands of pathologists had never thought of.

Jeremy Howard also illustrated this capability of deep learning algorithms in a TED talk last year.

Computers hold the promise of much faster analysis of much more complicated data sets than is currently possible by human researchers alone. Comparable data sets aren’t yet available for history—a triglyceride measurement is immediately quantifiable, while textual sentiment analysis isn’t (yet) as precise—but advances in the coming years will clarify previously inscrutable connections between events. In fact, I would be surprised if Palantir weren’t already tackling a crude version of this.

In the present-day study of history, there is no objective, capital-T Truth. While the New York Times praised Beevor’s work for its attention to the Sino-Japanese War, the Guardian criticized its weakness in the Pacific theater. The two assessments are so starkly different as to make a reader wonder whether the reviewers had read the same book.

It is often said that those who ignore history are doomed to repeat it. But are we really any better off if the history we learn is peppered with misrepresentations and inaccuracies? The ultimate value of a more objective approach to history would be better decision-making: a clearer understanding of cause and effect and better outcomes for the community, the country, and the world.


[1] “The Sense of an Ending”; attributed to the fictional French historian Patrick Lagrange

[2] e.g. that Rommel was overrated as a general because much of his success was due to luck — his aggressive movements coinciding with moments of Allied cautiousness — rather than skill

[3] Summary of Beevor’s thesis: the ideologically motivated Axis powers fought the war to gain empires, but needed to already possess empires in order to win it. While this thesis isn’t particularly original, Beevor’s narrative of the conflict is interesting if you’re a history geek with 40 hours of commute time to occupy.