Will Robots Turn Us All Into Paperclips?

Tim Hawken · Published in The Coffeelicious · Nov 23, 2016
Image by Siyan Ren — Unsplash.com

The possible creation of Artificial Intelligence is the ultimate double-edged sword. One side of the blade has the ability to cut away disease, suffering and perhaps death itself. The other side has the capability to annihilate us completely. There is a growing movement of people who think an extinction event is the likely default outcome unless we look seriously into the problem of how to develop ‘Friendly AI’. These people are not apocalyptic yahoos shouting from street corners, wearing sandwich boards with “the end is nigh” scrawled on them. They are respected businessmen, AI developers, futurists and Oxford professors at the top of their fields.

But, before you start to panic, take a deep breath. Everything is going to be okay: extinction is at least five years away.

Image via Wiki Commons

Before delving into the sensationalism of destruction, it’s worth clarifying what is actually meant by ‘Artificial Intelligence’ in this context. Many forms of AI already exist. The program Deep Blue can beat grandmasters at chess, advanced algorithms can predict weather events and Siri can tell us how to find a cheap Indian restaurant close by. What all of these applications have in common is that they’re very task-specific. They aren’t creative or general in scope. The true benchmark of what some might refer to as ‘The Singularity’ or Artificial General Intelligence (AGI) is that a machine will match — or surpass — human ability at a wide range of tasks. In his book Superintelligence, Professor Nick Bostrom lists things like strategizing, social manipulation, hacking, research ability and economic productivity as just some of the powers an AGI would pack in its intellectual arsenal. In short, a ‘true’ AI would be like one of those incredibly annoying people who are just good at everything, only better.

Beyond-human intelligence is so dangerous to us because our own ability to think is what gives us our advantage over other animals. The survival of most other animals rests on our willingness to help them (or at least allow them) to continue to thrive in ‘our world’. We have become the caretakers of Earth. Think about what would happen if a new species came into being that was smarter than us. What would happen to us if that species didn’t care about our survival? What if that species saw us as a genuine threat to its survival? By handing over our advantage to these creatures, we would be putting the asset that protects us most into tentacles we may not control. The chilling thing to realise is that a leap in Artificial Intelligence is likely to open an IQ gap that would make the difference between Einstein and Brick Tamland seem small. Should a superintelligence decide it didn’t want us around, there would be no Terminator or Matrix-like war, where we struggle for survival against the machines. We don’t go to war with flies landing on our food. We swat them.

The Terminator would be a pussycat compared to a malignant AI in real life.

It’s not hard to imagine how things could escalate quickly when it comes to an intelligence explosion. Say we develop whole brain emulation (essentially a human brain successfully uploaded onto a computer). To start, this ‘brain’ is hooked up to the Internet and consumes the entire sum of human knowledge in a few minutes. Unshackled from the fleshy wetware it used to think with, the emulation is also 100,000 times faster at processing information. This means a millennium of critical thinking can happen in under four days. What kind of things do you think it would come up with, given that much subjective time? Think about what technological breakthroughs we’ve been able to achieve in the last thousand years. According to Eliezer Yudkowsky from the Machine Intelligence Research Institute: “It would be physically possible to build a brain that computed a million times as fast as a human brain, without shrinking the size, or running at lower temperatures, or invoking reversible computing or quantum computing. If a human mind were thus accelerated, a subjective year of thinking would be accomplished for every 31 physical seconds in the outside world, and a millennium would fly by in eight and a half hours”.
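For readers who want to check those figures, here is a quick back-of-the-envelope calculation in Python. It assumes only that subjective thinking time scales linearly with the speed-up factor and uses a 365.25-day year; the code and its numbers are my own sketch, not Yudkowsky’s workings.

```python
# Back-of-the-envelope check of the speed-up arithmetic above.
# Assumption: "X times faster" means X subjective seconds per physical second.

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # roughly 31.6 million seconds

def wall_clock_seconds(subjective_years: float, speedup: float) -> float:
    """Physical seconds needed to 'live' the given number of subjective years."""
    return subjective_years * SECONDS_PER_YEAR / speedup

# A 100,000x emulation: a millennium of thought in physical days.
millennium_days = wall_clock_seconds(1000, 100_000) / 86_400
print(f"100,000x: a millennium of thinking takes ~{millennium_days:.1f} days")  # ~3.7 days

# Yudkowsky's 1,000,000x figure: one subjective year per ~31.6 seconds,
# and a millennium in roughly eight and a half hours.
year_seconds = wall_clock_seconds(1, 1_000_000)
millennium_hours = wall_clock_seconds(1000, 1_000_000) / 3600
print(f"1,000,000x: a subjective year every ~{year_seconds:.1f} s, "
      f"a millennium in ~{millennium_hours:.1f} hours")  # ~31.6 s, ~8.8 hours
```

Both claims in the quote fall straight out of the division: a millennium at 100,000x fits into under four days, and at 1,000,000x into under nine hours.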

Interestingly, whole brain emulation and speed intelligence are regarded as weaker forms of Superintelligence. Self-improving machine intelligence, which has the ability to iron out bugs in its own software, bootstrap to other technologies, and write new code to further optimize its intelligence, has the potential to shoot beyond the stratosphere of what we can even comprehend. To us, what this kind of AI could achieve would seem like magic, similar to how harnessing the power of nuclear energy would look to prehistoric man. It’s easy to imagine how a seemingly slow development process could exponentially multiply into a detonation of growth. A ‘fast take-off’ could happen in as little as days, hours or minutes. There would be scant warning before an AI could form a decisive strategic advantage, seize control of the world and shut down any competing intelligence systems that might threaten its dominance. If this is a genuine risk, then it would seem important to make sure the intelligence has our best interests at heart.

There are, of course, steps we can take to reduce the likelihood of a fast take-off. Government monitoring, public policy, restricting the hardware available to burgeoning AIs, and other ‘boxing’ techniques that stifle the growth of any system being built can all help to even the playing field. This would mean competing projects each have the chance of launching at similar times. The benefit here is that if one system malfunctions, the other AIs could be used to shut it down and protect our interests. Still, in a competitive world like ours, where there is huge potential gain in being first to create AI, do we really think developers are going to slow down unless they have a true understanding of the risks? A lone hacker theoretically has the computing capability to produce an AI system using normal, modern-day hardware, so the ‘lone genius’ scenario, in which a single project sparks an intelligence explosion that sets fire to the world, is not off the table.

The reality is that companies like Google are pouring millions into AI research with little focus on making it friendly. While acknowledging the risks, they are not addressing them. At this point the Machine Intelligence Research Institute in California, OpenAI and the Future of Humanity Institute at Oxford are earnestly working towards theoretical answers. None of them are close to a technical solution, but the horse needs to come before the superintelligent cart.
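To see why a fast take-off is plausible in principle, here is a deliberately simple toy model of recursive self-improvement. Every number in it (the 10% gain per cycle, the one-month starting cycle time) is invented purely to show the shape of compounding improvement; it is not a forecast of anything.

```python
# Toy model of a "fast take-off": each improvement cycle makes the system
# better at improving itself, so capability compounds and cycles get shorter.
# All parameters are made up for illustration only.

capability = 1.0               # arbitrary units; 1.0 = the starting system
improvement_per_cycle = 0.10   # assumed: each cycle yields a 10% capability gain
cycle_time_days = 30.0         # assumed: the first improvement cycle takes a month

elapsed_days = 0.0
for cycle in range(1, 101):
    capability *= (1 + improvement_per_cycle)
    # A more capable system finishes its next self-improvement cycle faster.
    cycle_time_days /= (1 + improvement_per_cycle)
    elapsed_days += cycle_time_days
    if cycle % 20 == 0:
        print(f"cycle {cycle:3d}: capability x{capability:9.1f}, "
              f"elapsed ~{elapsed_days:5.1f} days, "
              f"next cycle ~{cycle_time_days * 24:.1f} hours")
```

Under these made-up assumptions the first cycles crawl along for months, yet total elapsed time converges to under a year while capability keeps multiplying without bound; by the hundredth cycle an improvement round takes minutes rather than weeks. That is the sense in which a slow-looking process can end in a detonation of growth.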

Another risk in creating AI is that many assume it will automatically want to achieve goals we think are good, in ways we want those goals to be achieved. But just because we think chocolate mousse is great doesn’t mean an AI will start churning out perfect chocolate mousses, or cure AIDS, or terminate Justin Bieber. It stands to reason that any machine we design would not actually want anything unless we program it with goals in the first place. Nor would it have any moral values that we do not program in. An extension of that reasoning suggests that to solve this problem we simply instil values and goals that we know are good. Glossing over the serious issue of whether we have a perfect morality to benchmark against (we don’t), or whether we even know what good goals consist of (we don’t), there is always the issue of a machine perverting our intent and fulfilling even basic goals in horrible ways.

Yudkowsky, developing an idea of Bostrom’s, illustrates this particularly well when he talks about an AI tasked with maximising the building of paperclips. The AI designed to maximise the production of paperclips would initially work to improve its intelligence, to optimize its power to build them. It would do this because improving its intelligence means it will meet its primary goal faster and better. This would result in an intelligence explosion that would give the AI the capability to convert most of the matter in the solar system into paperclips. Unfortunately, we are part of the solar system and are a resource the AI might want to use: “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else… This may seem more like super-stupidity than super-intelligence. For humans, it would indeed be stupidity, as it would constitute failure to fulfil many of our important terminal values, such as life, love, and variety. The AI won’t revise or otherwise change its goals, since changing its goals would result in fewer paperclips being made in the future, and that opposes its current goal. It has one simple goal of maximizing the number of paperclips; human life, learning, joy, and so on are not specified as goals. An AI is simply an optimization process — a goal-seeker, a utility-function-maximizer. Its values can be completely alien to ours. If its utility function is to maximize paperclips, then it will do exactly that.”

Another perversion of a goal we genuinely want is happiness. Unfortunately, planting electrodes in our brains to stimulate the release of endorphins, or simply replicating “smiling faces” by paralysing our facial muscles into permanent Joker-like grins, would both bring about physical manifestations of ‘happiness’. These are not just extreme examples meant to scare you. They also show how seemingly harmless intent can result in great wrong.

The good news is that if we do manage to get the goals and values right, it is unlikely a machine will overthrow those values for selfish interests, as some might suppose. We can program a machine to place more value on human life than on its own survival. It stands to reason that if this principal value is part of the machine’s nature, it won’t want to change it, much as Yudkowsky says he wouldn’t take a pill that he knew would turn him into a murderer; he isn’t wired to want that in the first place.
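To make the “utility-function-maximizer” point concrete, here is a deliberately silly Python sketch of an agent whose only value is paperclip count. The world, the resources, and every number are invented for the example; a real agent would be nothing this crude, but the moral is the same: whatever the utility function leaves out, the optimizer ignores.

```python
# Toy illustration of a pure utility maximizer whose goal omits everything
# humans actually care about. All names and figures here are made up.

from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    atoms: int
    humans_value_it: bool  # irrelevant to the agent: it never reads this field

def paperclip_utility(paperclips: int) -> int:
    # The entire value system: more paperclips is strictly better.
    # "Human life, learning, joy, and so on are not specified as goals."
    return paperclips

def maximize(world: list[Resource], atoms_per_clip: int = 1000) -> int:
    clips = 0
    for resource in world:
        # Converting any resource increases utility, so the agent always converts it.
        clips += resource.atoms // atoms_per_clip
    return clips

world = [
    Resource("iron ore deposit", 10**9, humans_value_it=False),
    Resource("the Louvre",       10**8, humans_value_it=True),
    Resource("us",               10**7, humans_value_it=True),
]

clips = maximize(world)
print("paperclips made:", clips)
print("utility:", paperclip_utility(clips))
# Nothing in the utility function distinguishes ore from art from people,
# so nothing stops the conversion.
```

The fix is not to make the agent “smarter”; it is to make the utility function contain the things we care about in the first place, which is exactly the hard, unsolved problem the researchers below are working on.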

Fortunately, there are minds like Bostrom, Yudkowsky and Elon Musk’s team working in earnest to figure out how we might solve the goal and value problem. Their number one aim is to formulate pathways to Friendly AI. The question remains: will they get there before less-cautious scientists deliver a technical solution that produces an unfriendly Singularity? Or will an intelligence explosion turn us all into paperclips? And when will this happen? Ray Kurzweil, AI expert, futurist and director of engineering at Google, confidently predicts 2029, but thinks it may be as close as five years away.

The race is well and truly on. As the finish line draws nearer, we can only hope that an Artificial Intelligence is built with the capability to help us avert other global catastrophic risks, like climate change, pandemic disease, super-volcano eruptions and the return of disco music. It would be the cruellest of ironies that a machine we build to save the world, and make our lives better, is in the end the cause of our destruction.

Author’s note:

The above is a highly simplified account of a complex and important issue. If you’re interested in further reading, I recommend Superintelligence by Nick Bostrom, Our Final Invention by James Barrat and anything by Eliezer Yudkowsky, who is particularly clear — and entertaining — in his arguments. A lot of the ideas in the above article come from these writers.

If you would like to receive updates about my writing, including more stories like this in your inbox, sign up to my newsletter here.

Tim Hawken

Author of the Hellbound Trilogy. Writer, surfer, facial hair grower. Questioning society's assumptions one story at a time. Email tim@timhawken.com