Why deep neural networks don’t actually think
Neural networks continue to solve ever more challenging problems, but they will never think as long as we use a “layered” approach
It’s 2019 now and we have machines that can
- Beat the best humans in the ancient game of Go
- Get ever more impressive scores in games like Flappy Bird
- Guide cars to drive themselves
- Interpret sentiment from text
- Create images of faces of people that have never existed before
- … and the list goes on
But the one thing that machines fail to do is… think. Let’s talk about why.
Popular Science and Fiction
Magazines continue to draw a link between human and artificial intelligence, which blurs our common understanding of what’s possible. Headlines continue to allude to the “fact” that machines “think” or are somehow imbued with the ability to act in a sentient manner.
We continue to conjure up images of “Rosie” from “The Jetsons” or even HAL from “2001: A Space Odyssey.” It’s a lot more fun to think of robots that can think than to realize the cold hard truth that they (currently) can’t. It may seem like semantics, but let’s delve into how the basic mechanisms of these artificially “intelligent” systems work.
You’ll sometimes hear how Artificial Neural Networks are “biologically inspired.” We have neurons in our brains (roughly 86 billion, by current estimates) and they connect to each other in many interesting ways. In fact, there are on the order of 100 trillion neural connections (called synapses) in the average human brain.
To put that into perspective, that’s hundreds of times the number of stars in our galaxy.
These neurons are exceedingly simple in that all they do is receive a small electrical impulse from one or more neurons and pass along an electrical impulse to one or more other neurons via the synapses. That’s it. That’s how all of your thoughts, emotions, memories, actions, decisions and regrets happen: tiny electrical impulses through neurons that decide whether to send a signal to other neurons, and how strong that signal should be.
This is a REAL neural network. Your brain has actual neurons that connect (in nonlinear ways) to each other in all sorts of fantastic arrangements. A single neuron can be connected to just one other neuron or to thousands. The possibilities are endless.
Artificial Neural Networks (ANNs)
Almost all of the incredible advancements in machine learning as of late are due to the “Artificial Neural Network,” which is an attempt, albeit an extremely simple one, to model how the brain works. All of these fantastic feats are achieved with the help of a technology that was first conceived of in the 1940s. That’s right: all of our modern advances in machine learning and artificial intelligence rest on an idea that is nearly 80 years old. It was further refined in the 1970s with the advent of “backpropagation.” This magical innovation is what gives machines the apparent ability to “think.”
Some 8th Grade Math
Before we understand how ANNs work, let’s revisit some basic algebra. At its core, algebra is about solving equations with unknowns (called “variables”). A “system of equations” is when you have multiple equations and multiple variables and you need to solve for those variables. Something like:
If you have two variables, in this case x and y, you need at least 2 equations to figure out what those variables are. I won’t force you to go through the math, but in this case x is 3 and y is -1/9. If you had only one of those equations, say 8x + 9y = 23, there are a lot of pairs of numbers that can satisfy it. If x = 0, then y would have to be 2.5555… If x = 1, then y would be 1.6666… And it goes on infinitely from there. With only one equation and 2 variables, there is literally an infinite number of possible solutions.
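As a quick check of the arithmetic above, here is a minimal Python sketch that solves 8x + 9y = 23 for y at a few chosen values of x. The function name `y_for` is just an illustrative choice, and exact fractions are used so the repeating decimals stay readable.

```python
from fractions import Fraction


def y_for(x):
    """Solve 8x + 9y = 23 for y, given a chosen value of x."""
    return Fraction(23 - 8 * x, 9)


# Every x we pick produces a valid y, which is why a single equation
# with two variables has infinitely many solutions.
solutions = {x: y_for(x) for x in range(4)}
for x, y in solutions.items():
    print(f"x = {x}, y = {y} (about {float(y):.4f})")
```

Note that x = 3 gives y = -1/9, the pair quoted above. The single equation cannot tell us that this pair is any more “right” than the others.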
It’s only by adding the other equation that we have introduced some bounds. But let’s expand it a bit, and instead of having 2 variables and 2 equations, we have 1,000 variables and 100 equations. We could no longer pin down one exact answer for every variable, but we could still find values that fit the equations pretty closely.
That’s all an ANN is doing. It’s trying to come up with numbers for the (potentially) tens of thousands of variables and since it can’t come up with numbers that DEFINITELY work, it’s finding numbers that give the fewest errors.
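One way to make “numbers that give the fewest errors” concrete is a least-squares fit. The sketch below is an illustration of the analogy, not how real networks are trained (they use iterative methods rather than a closed-form solve). It takes 8x + 9y = 23 from earlier and adds two equations that are purely hypothetical, chosen so that no pair (x, y) satisfies all three at once, then finds the pair with the smallest total squared error.

```python
def squared_error(equations, x, y):
    """Total squared miss across all equations of the form a*x + b*y = c."""
    return sum((a * x + b * y - c) ** 2 for a, b, c in equations)


def least_squares_2d(equations):
    """Find the (x, y) minimizing squared_error via the normal equations."""
    saa = sum(a * a for a, b, c in equations)
    sab = sum(a * b for a, b, c in equations)
    sbb = sum(b * b for a, b, c in equations)
    sac = sum(a * c for a, b, c in equations)
    sbc = sum(b * c for a, b, c in equations)
    det = saa * sbb - sab * sab
    # Cramer's rule on the 2x2 system [[saa, sab], [sab, sbb]] [x, y] = [sac, sbc]
    x = (sac * sbb - sab * sbc) / det
    y = (saa * sbc - sab * sac) / det
    return x, y


# (a, b, c) means a*x + b*y = c. The last two rows are made up for this example.
equations = [(8, 9, 23), (1, 1, 3), (1, -1, 3)]
best_x, best_y = least_squares_2d(equations)
best_error = squared_error(equations, best_x, best_y)
print(best_x, best_y, best_error)
```

The error never reaches zero here because the three equations disagree with each other. The fit simply settles on the compromise that misses by the least.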
We’re about to get into the architecture of ANNs, and the math can get pretty intense, but for our purposes, the “simple system of equations” model will suffice.
Neural Network Model
The diagram in Figure 4 outlines a basic ANN: there are 2 inputs, 2 hidden units and 1 output. Each circle is a “neuron,” and has a certain number of weights and a bias associated with it. You can think of these artificial neurons as being loosely similar to the real neurons in your brain. They receive a signal from the previous neuron(s) and decide whether or not to pass a signal on to other neurons. In the model above, the signals are flowing from left to right. Through the “system of equations” that we discussed above, the weights and biases adjust continually to match the data being fed.
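To make the left-to-right flow concrete, here is a minimal sketch of a network with 2 inputs, 2 hidden units and 1 output. Everything specific in it is an assumption for illustration: the weights and biases are made-up numbers, and the sigmoid is just one common choice for how a neuron decides how strongly to “fire.”

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of the incoming signals plus a bias, then the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def forward(inputs, hidden_params, output_params):
    """Pass signals from the inputs, through the hidden layer, to the output."""
    hidden = [neuron(inputs, weights, bias) for weights, bias in hidden_params]
    out_weights, out_bias = output_params
    return neuron(hidden, out_weights, out_bias)


# Hypothetical values; a trained network would have learned these from data.
hidden_params = [([0.5, -0.4], 0.1), ([0.3, 0.8], -0.2)]
output_params = ([1.2, -0.7], 0.05)

prediction = forward([1.0, 0.0], hidden_params, output_params)
print(prediction)
```

Each hidden unit sees both inputs, and the output unit sees both hidden units. Nothing ever flows backward or sideways during a prediction, which is the “layered” structure this article is questioning.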
While the ANN in Figure 4 can only solve relatively simple problems, the one shown in Figure 5 is suitable for much more complex problems — like identifying a cat in a picture or beating a human opponent in chess.
Given some series of inputs (and inputs can be a great many things), like the current position of a chess board, the network will “learn” which move gives it the best chance of beating an opponent.
When I say “learn,” remember — all it’s doing is adjusting the weights so each neuron will fire with certain strengths and only for certain inputs. We feed the network thousands of examples of chess games (or whatever the domain is) and give it examples of “success.”
There’s a lot more to Artificial Neural Networks, but the important thing to understand here is that the data is flowing from the inputs, through some hidden layers to the outputs. We give it thousands of “training examples” of what we want it to “learn,” and all it’s doing is adjusting the weights by basically solving HUGE systems of equations.
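The “adjusting the weights” step can be sketched with the smallest possible case: a single neuron with one weight and one bias, nudged by gradient descent until its predictions match the training examples. This is a deliberate simplification (no hidden layers, so no backpropagation through them), and the examples, learning rate and epoch count are all hypothetical values picked for the demonstration.

```python
def train(examples, learning_rate=0.05, epochs=2000):
    """Fit prediction = weight * x + bias by repeatedly reducing the squared error."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in examples:
            error = (weight * x + bias) - target
            grad_w += 2 * error * x  # how the squared error changes with the weight
            grad_b += 2 * error      # how the squared error changes with the bias
        # Step each parameter a little in the direction that lowers the average error.
        weight -= learning_rate * grad_w / len(examples)
        bias -= learning_rate * grad_b / len(examples)
    return weight, bias


# Made-up training examples that follow target = 2 * x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train(examples)
print(weight, bias)
```

The neuron is never told the rule behind the examples. It only sees inputs paired with the desired outputs, and the loop keeps shrinking the error until the weight and bias settle close to 2 and 1.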
Lack of actual problem solving
I’ve just given the laziest and least math-filled explanation of how an artificial neural network works, and the technology has proven capable of some pretty amazing things, but it’s not ACTUAL learning or thought. Let’s go back to our good old-fashioned human brains. The connections are NOT linear. They do NOT go from “left to right.” The synapses aren’t connected in some structured, easy-to-follow pattern that fits nicely in a diagram.
Thought — REAL thought — is something that we don’t fully understand, but I think we clearly understand what it isn’t.
The system that beat the best human players in the ancient game of Go was REALLY good at… playing Go. That’s it. It isn’t able to take that incredible expertise and apply it to ANY other problem. This is why our best learning machines don’t actually think.
Our brains adapt and change continuously. Our brain cells don’t seem to be special-purpose, unlike ANNs, which are trained to do ONE task. The same neuron that’s activated when you cry might also be activated when you lie while playing poker.
Can machines EVER think?
This is a question that people have been trying to answer since before there were machines. In my humble opinion, in order to have true thought, we need to move away from specially trained ANNs and consider new models of machine learning. Perhaps instead of the core unit being layers of neurons in a forward/backward arrangement, it can be something less rigidly structured. It should also take into account that neurons sometimes die off without the core learning and thought processes changing.
I don’t know what the future of thinking machines will be like. But I think it’s certainly interesting to think about.
To read more about deep learning, please visit my publication, Shamoon Siddiqui’s Shallow Thoughts About Deep Learning.