Unleashing Machine Learning on Literature’s Great Works

My dreams of an ML/text startup inch toward reality.

Nick Kolakowski
Published in The Startup
5 min read · May 14, 2019

Credit: Pixabay/Sarah_Loetscher

As a writer and editor who focuses largely on tech, I was instantly intrigued when OpenAI, the nonprofit ostensibly designed to prevent A.I. from being used in terrible ways, announced that it had created a “large-scale unsupervised language model” (named GPT-2) capable of generating “coherent paragraphs of text” (according to the institute’s blog posting).

Featuring 1.5 billion parameters and trained on a data set of 8 million web pages, GPT-2 could supposedly achieve “state-of-the-art performance on many language modeling benchmarks.” In other words, it could effectively predict the next word in a text string.
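
To make "predict the next word" a little more concrete, here is a toy sketch in Python. It is not how GPT-2 works: GPT-2 is a large neural network that weighs a long stretch of preceding text, while this example only counts which word most often follows another in a tiny made-up corpus. The function names and the sample sentence are hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def continue_text(model, seed, max_words=5):
    """Extend a seed phrase one predicted word at a time."""
    words = seed.split()
    if not words:
        return seed
    for _ in range(max_words):
        nxt = predict_next(model, words[-1])
        if nxt is None:
            break  # the model has never seen this word, so it has nothing to offer
        words.append(nxt)
    return " ".join(words)


# Hypothetical miniature "training set"; a real model sees millions of pages.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(corpus)

print(predict_next(model, "the"))
print(continue_text(model, "sat", max_words=2))
```

The seed plays the same role here that a famous first line plays with the real model: it is the starting text, and everything after it comes from the model's guesses about what should follow.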

People freaked out, anticipating that this model would lead to the rise of superpowered “Fake News.” Fearing that very danger, OpenAI even declined to release the full version of the thing.

But then a brave soul named Adam King (@AdamDanielKing) set up a “medium-sized model” of GPT-2, dubbed 345M (because it uses 345 million parameters instead of 1.5 billion). “While GPT-2 was only trained to predict the next word in a text, it surprisingly learned basic competence in some tasks like translating between languages and answering questions,” he wrote. “That’s without ever being told that it would be evaluated on those tasks.”

I was further intrigued: Could someone use a model like this to generate prose for, say, a startup that creates reports? Could it even write books? For years, I’ve toyed with the idea of a company that uses A.I. and machine learning (M.L.) to churn out an endless number of romance and pulp novels — had my ship finally arrived?

I tried out the model (and you can, too!), using some of the most iconic first lines in literature as a seed. Here’s the walkthrough:

The Experiment

Let’s start off with a little of Jane Austen’s “Pride and Prejudice.” How does the algorithm handle itself around the manners and marriage of Britain’s Regency era? (Austen’s original writing is in bold; everything after is the province of A.I.)

It is a truth universally acknowledged, that a single man in possession of a good fortune

Nick Kolakowski

Writer, editor, author of 'Maxine Unleashes Doomsday' and 'Boise Longpig Hunting Club.'