The John Wick Theorem of Artificial Superintelligence

A response to the Lebowski Theorem

Hein de Haan
The Startup


Hacking your reward function? Photo by Chase Fade on Unsplash

What is the Lebowski Theorem? A reference to the lazy “The Dude” in The Big Lebowski, this idea was thought up by Joscha Bach and tweeted out in April of 2018:

Now, this is a very interesting thought. First, let me explain the background. The idea is that we have an Artificial Intelligence (AI) that’s significantly smarter than any human, making it an ASI: an Artificial Superintelligence or, as Joscha calls it, a superintelligent AI. Such an ASI has a goal to strive towards: maybe it was built to invent a cure for cancer, or for death itself. If its goal is to invent a cure for cancer, its reward function would give higher rewards to future scenarios in which the ASI has invented a cure than to scenarios in which it hasn’t. That’s pretty much what a reward function is: a way of scoring outcomes by how well they satisfy the goal. The Connect Four playing AI I once built, for example, had a reward function that gave high rewards to winning positions, because the goal of the AI was to win at Connect Four.
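To make the idea of a reward function a bit more concrete, here is a minimal sketch in Python of what one could look like for Connect Four. This is not the code of the AI mentioned above; the board representation and the names `empty_board`, `has_four` and `reward` are assumptions made purely for illustration, as are the specific values 1.0, -1.0 and 0.0.

```python
from typing import List, Optional

ROWS, COLS = 6, 7  # standard Connect Four board size

# Hypothetical board representation: None is an empty cell,
# "X" and "O" are the two players' discs.
Board = List[List[Optional[str]]]


def empty_board() -> Board:
    """Create a board with no discs on it."""
    return [[None for _ in range(COLS)] for _ in range(ROWS)]


def has_four(board: Board, player: str) -> bool:
    """Return True if `player` has four discs in a row in any direction."""
    # Right, down, and the two diagonals; scanning every start cell
    # in these four directions covers every possible line of four.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in directions:
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(
                    0 <= rr < ROWS and 0 <= cc < COLS and board[rr][cc] == player
                    for rr, cc in cells
                ):
                    return True
    return False


def reward(board: Board, player: str, opponent: str) -> float:
    """Score a position from `player`'s point of view.

    Illustrative values only: winning positions get a high reward,
    losing positions a low one, everything else is neutral.
    """
    if has_four(board, player):
        return 1.0
    if has_four(board, opponent):
        return -1.0
    return 0.0


# Usage: four "X" discs next to each other in one row is a win for "X".
board = empty_board()
for col in range(4):
    board[ROWS - 1][col] = "X"
x_reward = reward(board, "X", "O")
o_reward = reward(board, "O", "X")
```

The point of the sketch is only that the reward function is the place where the goal lives: the AI prefers whatever positions this function scores highest, so changing the function changes what the AI strives for.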




As a science communicator, I approach scientific topics using paradoxes. My journey was made possible by a generous grant from MIRI (