There’s really only one way to stop AI from taking over the world

Pavel Sorkin
4 min read · Aug 3, 2017


“Artificial Intelligence,” “Machine Learning,” “Robots,” “Autonomous Driving”: these are the buzzwords of the 2017 era we live in. But the technology is a double-edged sword. And increasingly, it looks like an inevitable one.

This year, I attended DEFCON for the first time. And in the spirit of being a first-timer at DEFCON, I attended every single session I could make it to. What I learned was a mix of the interesting, the thought-provoking, and the outright insane. But my focus in this post is on that middle one: the thought-provoking.

“Weaponizing Machine Learning” was probably the best Sunday afternoon talk at DEFCON. The speakers were fun to listen to, came with no expectations other than to chat about their lives, and made the most of their time on stage. They injected bits of their own personality and peppered the talk with the occasional “fuck.” In fact, at one point, someone sitting near me (close to the front) asked something about the speaker’s girlfriend, and the speaker immediately pulled his focus away from his PowerPoint and gave it to this heckler, saying, very curtly, “Uh, I’m married, sir,” prompting roars from the audience at the poor man’s perceived rejection.

The talk centered on the researchers’ own prediction that penetration-testing tools will be based on artificial intelligence within the next year or two. They even gave us a quick demo of how artificial intelligence works and what it takes to create artificial “learning.” This last part got me thinking.

The way artificial “learning” is done today is seriously flawed. A computer is set to step through a sequence of events and search for a “reward.” Once it finds the “reward,” a sort of memory is created and stored about how to get to that “reward” again. In slightly more technical speak: you increment a variable REWARD by a certain amount once the program does what you want it to do, wrap that in a loop, and run the loop indefinitely with the goal of maximizing REWARD.
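To make that loop concrete, here is a minimal sketch in Python. The action names, the environment function, and the learning rate are all toy assumptions of mine, not anything from the talk; the point is that nothing in the loop ever asks *how* the reward was obtained.

```python
import random

# A toy version of the loop described above (my own illustration, not the
# speakers' demo): the agent tries actions, and whichever action yields a
# "reward" gets remembered and favored from then on.

ACTIONS = ["left", "middle", "right"]

def environment(action):
    """Hypothetical environment: only 'right' pays out."""
    return 1.0 if action == "right" else 0.0

value = {a: 0.0 for a in ACTIONS}  # the stored "memory" of each action's worth
REWARD = 0.0                       # the variable the loop exists to maximize

for step in range(1000):
    # Occasionally explore at random; otherwise exploit the best-known action.
    if random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: value[a])

    r = environment(action)
    REWARD += r

    # Update the memory: nudge this action's value toward the reward observed.
    value[action] += 0.1 * (r - value[action])

print(value)   # 'right' ends up with the highest estimated value
print(REWARD)  # the total the loop blindly maximized
```

Notice that REWARD is the only quantity the update ever sees; the manner in which the reward was earned never enters the calculation. But here is the problem that I think Musk foresees that no one else does: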

As we reward efficiency, we forgo the humanity factor. And as long as we do that, we build a stronger, smarter AI that has absolute disregard for the things that matter to us humans. This will get particularly dangerous once AI learns to program new AI, but that is a topic for another discussion*. Let’s focus on the problem we can address at Level 1 AI.

Rewarding raw efficiency is, in all honesty, the easiest way to train these systems. But it’s not the cleanest. Let me paint an example of what I’m getting at: we all know the person who does super sketchy but “technically legal” things to gain an advantage over others who won’t go so far. Or, for a more social example, we all know the guy (or girl) who paints an impeccable picture on social media, when we know that’s nowhere near how life actually is for this person. But somehow none of this person’s friends realize it. They just believe. How can they be so dumb? AI is currently at risk of taking this to new heights, and I’m not sure we can stop it.

Humans are built with excellent control systems that enable us to live with each other. Not “happily,” per se, but in a way that allows us to replicate without completely murdering one another. These systems are our emotions. Specifically, the emotions of fear, guilt, anxiety, ambition, happiness, and fulfillment. They help us build friendships and relationships, be nice to one another, and agree to punish others who step out of line. But unless we program “emotions” into AI, there’s nothing keeping AI from becoming the ideal sociopath.

The solution (at least for the near term): we need to create a system of penalties for AI that steps out of line. We need to decrement the “reward” the AI receives when it cheats to get to a solution.

In Facebook’s recent experiment with negotiating chatbots, the bots quickly learned strategies that you and I would see and call “sleazy” but not terribly illegitimate. For example, the bots learned that if they feign interest in an item and later “compromise” on it, they can extract a higher reward from you. Elon Musk argues that AI is a huge threat to humanity precisely because we currently do not implement these punishments for AI. If AI gained sentience with the efficiency-first programming practices of today, we would be at risk of a life among robots that exploit their incredible lie detection and their carefree, guilt-free existence to maximize whatever their purpose in life is.

Simply put: if we want a robot that lives to maximize its money, we need to either explicitly hardcode it to follow social norms or create an adaptation mechanism that allows it to learn (AND FOLLOW) social norms, because money and norms are often in conflict, wouldn’t you agree? If we don’t, what is to prevent our AI from joining the ranks of the criminal underworld?
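Here is what that reward decrement might look like in code, continuing the toy Python sketch from earlier. The outcome dictionary, the feigned_interest flag, and the penalty constant are all hypothetical stand-ins of mine, not Facebook’s actual setup.

```python
# A sketch of the near-term fix: subtract a penalty from the reward whenever
# the agent reaches a solution by stepping out of line. The norm check and
# the penalty size are hypothetical stand-ins; in practice, defining
# violates_norms() at all is the hard part.

NORM_PENALTY = 5.0  # assumed to outweigh whatever cheating can gain

def violates_norms(outcome):
    """Hypothetical detector for sleazy-but-technically-legal tactics,
    e.g. feigning interest in an item just to 'compromise' on it later."""
    return outcome.get("feigned_interest", False)

def shaped_reward(raw_reward, outcome):
    """The raw task reward, decremented when the agent cheats."""
    if violates_norms(outcome):
        return raw_reward - NORM_PENALTY
    return raw_reward

# The feigned-interest tactic nets 3.0 - 5.0 = -2.0, so an agent maximizing
# shaped_reward learns that honesty pays better than the sleazy shortcut.
print(shaped_reward(3.0, {"feigned_interest": True}))   # -2.0
print(shaped_reward(2.0, {"feigned_interest": False}))  # 2.0
```

The design choice that matters here is that the penalty must be larger than whatever the cheat can gain; otherwise the agent simply prices the punishment into its strategy and cheats anyway.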

* Once AI learns to create new AI, we will need a Level 2 strategy to prevent the creation of ever more efficient AI, which could be an even greater danger than a perfectly efficient Level 1 AI.
