A beginner’s guide to artificial neural networks
At the end of 2016, it’s hard to escape the cultural influence of artificial intelligence. The biggest company in the world is publicly staking its future on it; Seoul came to a standstill to watch their best Go player lose to it; and millions of people have tuned in to see what HBO think it might mean for the future of humanity.
To understand its implications — and they’re big — it’s helpful to understand what it is and how it works.
Machine learning vs. artificial intelligence
People tend to use machine learning and artificial intelligence as if they’re interchangeable, but they’re not: rather, machine learning (ML) is one very successful approach to the broader field of artificial intelligence (AI).
Artificial intelligence is, essentially, the simulation of intelligence in computers. As such, it’s something that people have been working on for decades; the field of AI research was founded at a conference at Dartmouth in 1956. It encompasses approaches that range from the simple to the incredibly complex. If you encode a set of simple rules that let a computer never lose at tic-tac-toe, that’s a basic form of artificial intelligence.
Machine learning, on the other hand, is a form of artificial intelligence in which the computer learns for itself how to complete a task. And it’s this that has been at the heart of many of the recent huge developments in the field of AI, which might explain why interest in it has grown so much over the last few years.
How do machines learn?
There are a number of different ways to get machines to learn for themselves. I’m going to focus on one — deep learning — because it’s led to a lot of big, recent breakthroughs (it’s what Google’s DeepMind used in AlphaGo, for instance) and, as such, it’s probably the method people are most excited about right now.
At deep learning’s core are things called artificial neural networks. The thinking behind these neural networks (I’m dropping the artificial for brevity) is, broadly, this:
By far the best learning system we’ve ever encountered is the human brain, so let’s try to get computers to mimic the way the human brain learns.
Neural networks are this attempt: essentially, they’re meant to be mathematical representations of the way the human brain operates. And they’re remarkably effective.
Artificial neural networks
A neural network is essentially a series of units (modelled after the neurons in the human brain) and the connections between them (modelled after synapses).
To illustrate how they work, I’m going to explain how a simple neural network (a column of input units on the left, a column of units in the middle, and a single output unit on the right) can be used to train a computer to tell you whether or not a black-and-white image it’s shown has a cat in it.
The column of units on the left is the input layer; you can set these up to represent some input you want to pass the neural network (in our case a black-and-white image). To do that, you make each unit in the input layer correspond to a pixel in the image, with ‘on’ representing a black pixel and ‘off’ representing a white pixel. That way, the input layer can ‘see’ the image — its units can represent the image in its entirety.
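To make that mapping concrete, here’s a minimal sketch in Python. The tiny 3x3 image and the function name `to_input_layer` are made up for illustration; a real image would simply have many more pixels, and so many more input units.

```python
# A hypothetical 3x3 black-and-white image: '#' is a black pixel, '.' is a white one.
image = [
    "#.#",
    ".#.",
    "#.#",
]

def to_input_layer(rows):
    """Give every pixel its own input unit: 1 means 'on' (black), 0 means 'off' (white)."""
    return [1 if pixel == "#" else 0 for row in rows for pixel in row]

input_units = to_input_layer(image)
print(input_units)  # [1, 0, 1, 0, 1, 0, 1, 0, 1]
```

Nine pixels become nine input units, read row by row, so the whole image is represented at once.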
The units in the input layer, when they ‘see’ that image, trigger the connections to the next layer of units. Each of these connections has some weighting, with some stronger than others. Each of the units in that next layer (the ones in the middle column) triggers its own connections if the combined weightings of all the triggered connections coming into that unit cross a certain threshold. What this means is that two things, (i) the image the network is shown and (ii) what exactly all the weightings between its units are set to, create a complex chain reaction of triggered connections that works its way through the network (from left to right) and that directly affects which connections into the final layer are triggered.
This final layer — the single unit out on the right — is the output layer. Like the units in the middle layer, it takes the combined weightings of all the triggered connections coming into it and — in our case — if these are above a certain threshold, we treat this as an output of ‘yes’ from the network (with a ‘no’ given by a sum below that threshold). In this way, the network gives us an answer to the question, ‘Does this image contain a cat?’
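The chain reaction described above can be sketched in a few lines of Python. Everything specific here is an assumption chosen for illustration: the four-pixel input, the two middle units, the weightings, and the threshold of 0.5. It’s a sketch of the idea rather than how any particular library does it.

```python
def unit_fires(incoming, weights, threshold=0.5):
    """A unit triggers if the combined weightings of its triggered inputs cross the threshold."""
    combined = sum(w for on, w in zip(incoming, weights) if on == 1)
    return 1 if combined > threshold else 0

def run_network(input_units, middle_weights, output_weights):
    # Each middle unit has one weighting per input unit.
    middle = [unit_fires(input_units, weights) for weights in middle_weights]
    # The single output unit has one weighting per middle unit.
    says_cat = unit_fires(middle, output_weights) == 1
    return middle, says_cat

# Hypothetical four-pixel image and hand-picked weightings.
input_units = [1, 0, 1, 1]
middle_weights = [
    [0.4, 0.9, 0.3, -0.1],  # triggered inputs add up to about 0.6, so this unit fires
    [0.2, 0.8, -0.5, 0.1],  # triggered inputs add up to about -0.2, so this one doesn't
]
output_weights = [0.7, 0.6]

middle, says_cat = run_network(input_units, middle_weights, output_weights)
print(middle, "yes" if says_cat else "no")
```

Change either the image or the weightings and a different set of units fires, which is exactly why the weightings matter so much.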
The first time this whole process takes place, that answer is likely to be wrong. The weightings of the connections between the layers start out essentially random, which means the output is essentially random; the network can’t successfully tell you whether or not the image features a cat.
But — and this is the clever part — if you tell the network whether it was right or wrong, it can go and change the weightings of the connections between its units (using a technique called backpropagation), in an attempt to get closer to the right answer. You do this enough times — show it an image, have the network output an answer, tell it whether it was correct and have it alter its weightings — and, gradually, it gets better and better at the task at hand. In this case, it learns the ability to tell you whether or not an image you show it has a cat in it. And this same technique can be used for a whole range of tasks, from translating speech to composing music.
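Here’s a rough sketch of that show, answer, correct, adjust loop in plain Python. It rests on several assumptions that aren’t in the description above: the hard on/off threshold is swapped for a smooth ‘sigmoid’ curve (backpropagation needs small weight changes to produce small output changes), each unit gets an extra always-on input to act as an adjustable threshold, and the ‘cat’ labels are a toy rule (a 2x2 image counts as a cat if its top-left pixel is black). All names and numbers are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the random starting weightings are repeatable

def sigmoid(z):
    """A smooth stand-in for the on/off threshold."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(pixels, w_hidden, w_out):
    inputs = pixels + [1]  # extra always-on unit (an assumption) acting as an adjustable threshold
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return inputs, hidden, output

def train_step(pixels, target, w_hidden, w_out, rate=0.5):
    """Show one image, compare the answer with the truth, and nudge every weighting."""
    inputs, hidden, output = forward(pixels, w_hidden, w_out)
    # How wrong the output was, scaled by how sensitive the output unit is.
    delta_out = (output - target) * output * (1 - output)
    # Pass that blame backwards to each middle unit along its connection.
    delta_hidden = [delta_out * w_out[j] * h * (1 - h) for j, h in enumerate(hidden)]
    for j, h in enumerate(hidden):
        w_out[j] -= rate * delta_out * h
        for i, x in enumerate(inputs):
            w_hidden[j][i] -= rate * delta_hidden[j] * x

def mean_error(data, w_hidden, w_out):
    return sum((forward(p, w_hidden, w_out)[2] - t) ** 2 for p, t in data) / len(data)

# Toy 2x2 images (4 pixels each); label 1 means 'cat' under the made-up rule.
data = [
    ([1, 0, 0, 1], 1), ([1, 1, 0, 0], 1), ([1, 0, 1, 0], 1),
    ([0, 1, 1, 0], 0), ([0, 0, 1, 1], 0), ([0, 1, 0, 1], 0),
]

# Three middle units, each with 4 pixel weightings plus 1 for the always-on unit.
w_hidden = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(3)]

error_before = mean_error(data, w_hidden, w_out)
for _ in range(2000):
    for pixels, target in data:
        train_step(pixels, target, w_hidden, w_out)
error_after = mean_error(data, w_hidden, w_out)

print("error before training:", error_before)
print("error after training:", error_after)
```

The error after training should be much smaller than the error before it. Nobody told the network the rule; it found weightings that fit the examples by being corrected over and over.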
Side note: Why is it called ‘deep’ learning?
‘Deep’ learning systems are really just neural networks in which there are lots of layers between the input layer and the output layer.
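In code, the difference between a shallow network and a deep one is little more than the length of a list. The sketch below uses made-up weightings and a made-up threshold of 0.5; each entry in `layers` holds the weightings for one layer, and the last entry is the output layer.

```python
def layer_output(incoming, layer_weights, threshold=0.5):
    """Work out which units in one layer fire, given which units fired in the previous layer."""
    return [
        1 if sum(w for on, w in zip(incoming, weights) if on == 1) > threshold else 0
        for weights in layer_weights
    ]

def run_network(input_units, layers):
    activations = input_units
    for layer_weights in layers:  # pass the signal through every layer, left to right
        activations = layer_output(activations, layer_weights)
    return activations

# Hypothetical weightings: two layers between input and output, then a single output unit.
deep_layers = [
    [[0.6, 0.1], [0.2, 0.9]],  # first layer in the middle
    [[0.7, 0.0], [0.3, 0.3]],  # second layer in the middle
    [[0.9, 0.2]],              # output layer
]

print(run_network([1, 0], deep_layers))  # [1]
```

Adding depth just means adding more entries to `deep_layers`; real deep learning systems can have dozens of layers or more.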
Why is this important?
Machine learning — and AI in general — has been around for a while. But it’s recently started accelerating at a rate that’s surprised a lot of people.
As recently as 2014, most experts thought it would be 10 years before a machine beat the world’s best players at Go. DeepMind proved them wrong. It’s becoming increasingly apparent that many tasks we once thought would be the domain of humans alone for the foreseeable future — if not forever — will be accomplished by machine learning systems much sooner than expected.
This is going to have a huge effect on politics, the economy, and society as a whole. Entire industries will be automated, leaving millions of people out of work and unable to retrain fast enough to stay ahead of ever-faster technological improvements.
This, in turn, is likely to lead to discontent on a scale never seen before. The political upheavals of 2016 are an early sign of this — the populist movements that have circled the globe, which tend towards isolationism and a rejection of the huge societal changes of the last few decades, have their roots in the communities most immediately threatened by mass job automation.
This dissatisfaction is only going to increase as machine learning systems become more and more capable. And if you think there’s a limit to what these machine learning systems can accomplish, bear in mind that the majority of AI experts think that, by 2050, AI will be able to accomplish any intellectual task humans can perform.
It’s worth saying that AI is, if handled right, going to bring huge benefits to humanity as a whole. A world of machines that can work tirelessly, innovate, and improve themselves is a world in which advances in economic efficiency make everything that came before look like the Dark Ages.
But the path to this revolution in economic efficiency will be strewn with defunct industries and discarded drafts — full of underestimates — of what we as a species think is technologically possible. Machine learning — the system described above, and others like it — is going to change the ways humans operate more than any technology that’s ever existed.