Beyond Backpropagation: Can We Go Deeper Than Deep Learning?
Deep learning almost single-handedly made “AI” a buzzword in research labs, corporate board meetings, the stock market, and the homes of digital consumers. A branch of machine learning that uses artificial neural network structures to enable computers to learn, deep learning is responsible for recent performance leaps in technologies such as natural language processing, audio recognition, computer vision, autonomous vehicles, drug design, and advanced recommendation engines. The approach is also behind much of the tech world’s latest batch of milestone products: from Amazon Echo to Google’s AlphaGo to enterprise-ready machine learning APIs.
The Beginning (Or The End?) Of An AI Revolution
But, as the general public latches onto AI hype, the very pioneers behind deep learning are questioning whether it is the right approach to achieve true machine intelligence. Geoffrey Hinton, widely revered as the “godfather of deep learning”, now believes that the path dominating AI research leads to a dead end, even though it is built on a method he has championed for over three decades.
A direct descendant of George Boole, the mathematician who invented Boolean algebra in 1854, Hinton can claim a similarly formidable imprint on the world of information technology. A cognitive psychologist with a doctorate in artificial intelligence, Hinton has spearheaded decades of pioneering work, including hallmark contributions in:
- 1986, when he co-authored the paper that popularized backpropagation as a technique for training neural networks
- 2006, when he and two colleagues demonstrated how deep, multilayer neural networks could be trained effectively one layer at a time, reigniting interest in deep learning
- 2012, when his team’s deep neural network, trained with backpropagation, decisively beat other advanced systems in the large-scale ImageNet image recognition competition
- 2017, when he proposed his much-anticipated capsule network architecture as an improvement to Yann LeCun’s convolutional neural networks
Backpropagation enables computers to learn by iteratively adjusting the weights of a neural network in order to minimize the error between the model’s prediction and a ground truth comparison. Prior to 2010, computing power was still too limited to sustain practical applications, but several years of technological advances eventually positioned Hinton’s backpropagation method at the center of almost every AI-driven marvel. Google strategically acquired Geoffrey Hinton’s company DNNresearch Inc in 2013 and hired him as a lead scientist for Google Brain.
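The iterative weight-adjustment loop can be sketched in a few lines of Python. This is a deliberately minimal illustration, not Hinton's formulation: a single linear neuron y = w*x + b is trained by repeatedly nudging w and b against the gradient of the squared error between prediction and ground truth.

```python
import random

# Minimal sketch (an illustration, not Hinton's formulation): train a single
# linear neuron y = w*x + b with gradient updates that shrink the squared
# error between the model's prediction and the ground truth label.
def train(data, lr=0.1, epochs=200):
    w, b = random.uniform(-1, 1), random.uniform(-1, 1)
    for _ in range(epochs):
        for x, target in data:
            err = (w * x + b) - target   # prediction error
            # Gradients of 0.5 * err**2 with respect to w and b
            w -= lr * err * x
            b -= lr * err
    return w, b

random.seed(0)
labeled = [(x, 2 * x + 1) for x in [-2, -1, 0, 1, 2]]  # ground truth: y = 2x + 1
w, b = train(labeled)
print(round(w, 2), round(b, 2))  # recovers roughly w = 2, b = 1
```

In a deep network the same error signal is propagated backwards through many layers via the chain rule, but each weight update has this same shape: step against the gradient of the error.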
Deep learning algorithms produce the most reliable results and economic value when used for “supervised” learning. In supervised learning, algorithms are given structured training data that maps inputs (such as an image) to a label (such as “cat”). Unfortunately, the vast majority of information in the world is not structured so neatly. Unlike neural networks, human infants learn concepts quickly in unstructured, unsupervised learning environments.
In an interview with Axios, Hinton suggested that we need to move beyond backpropagation if we want to teach computers to achieve unsupervised self-learning like that of human infants. “I don’t think it’s how the brain works. We clearly don’t need all the labeled data,” he declared, “My view is throw it all away and start again.” Weighing in on the future and his own contributions in the field, he humbly concluded, “The future depends on some graduate student who is deeply suspicious of everything I have said.”
Hinton’s comments rocked the deep learning community, and thought leaders were quick to react.
Developer, entrepreneur and author Siraj Raval made a video agreeing with Hinton’s assessment of backpropagation and covering alternative research directions that seem promising. “If we really want to get to general artificial intelligence, then we have to do something more complicated or something else entirely,” he explained, “It’s not just about stacking layers and then backpropagating some error gradient recursively. That’s not going to get us to [artificial] consciousness. That’s not going to get us to systems that learn a huge variety of tasks.”
New York-based writer and programmer James Somers concurs. Writing for the MIT Technology Review, he argued that “once you understand the story of backprop, you’ll start to understand the current moment in AI, and in particular the fact that maybe we’re not actually at the beginning of a revolution. Maybe we’re at the end of one.”
Moving Beyond Supervised Deep Learning
Many researchers believe that to simulate human intelligence, we need to strengthen our capabilities in unsupervised learning and find innovative ways to train models without strict dependence on structured training data. Promising approaches include:
Autoencoders are neural networks that learn in an unsupervised manner by being trained to reproduce their own inputs at their outputs. Squeezing the data through a narrower intermediate layer forces the network to learn compact structures and representations that can be used for denoising, compression, and generation.
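A toy linear autoencoder illustrates the idea. In this hand-rolled sketch (plain Python, with invented dimensions, not a library API), 2-D points lying on a line are squeezed through a single latent number and reconstructed; the training target for each input is the input itself, so no external labels are needed.

```python
import random

# Toy linear autoencoder (a hand-rolled illustration, not a library API):
# 2-D points on the line y = 2x are encoded into one latent number and
# decoded back. The training target for each input is the input itself.
random.seed(1)
enc = [random.uniform(-1, 1) for _ in range(2)]    # encoder weights
dec = [random.uniform(-1, 1) for _ in range(2)]    # decoder weights
data = [(x, 2 * x) for x in (-1.0, -0.5, 0.5, 1.0)]

lr = 0.02
for _ in range(2000):
    for point in data:
        h = enc[0] * point[0] + enc[1] * point[1]        # encode to 1 number
        err = [dec[i] * h - point[i] for i in range(2)]  # reconstruction error
        dh = err[0] * dec[0] + err[1] * dec[1]
        for i in range(2):
            dec[i] -= lr * err[i] * h
            enc[i] -= lr * dh * point[i]

h = enc[0] * 1.0 + enc[1] * 2.0                    # encode the point (1, 2)
print(round(dec[0] * h, 2), round(dec[1] * h, 2))  # reconstructs roughly (1, 2)
```

The 2-D input survives the trip through a 1-D bottleneck because the network has discovered the line the data lies on: a learned representation, extracted without any labels.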
Generative adversarial networks (GANs) pit two neural networks, a generator and a discriminator, against each other to produce high-quality generated results. The generator creates content (such as images) from scratch, while the discriminator, trained on real-world data, must distinguish genuine images from those created by the generator. NVIDIA recently used GANs to generate highly realistic images of human faces.
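The adversarial setup can be miniaturized to one dimension. In this sketch (an illustrative toy, nothing like NVIDIA's face-generation architecture), the "real" data is the single value 3.0, the generator is one trainable number, and the discriminator is a logistic regression; each side is updated against the other in turn.

```python
import math

# Toy 1-D GAN (an illustrative sketch, not a production architecture):
# the "real" data is the single value 3.0, the generator is one trainable
# parameter b, and the discriminator is D(x) = sigmoid(w*x + c).
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

real = 3.0
b = 0.0                      # generator's fake output
w, c = 0.0, 0.0              # discriminator parameters
lr, decay = 0.05, 0.01       # small weight decay keeps the toy dynamics stable

for _ in range(5000):
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * b + c)
    w += lr * ((1 - d_real) * real - d_fake * b) - decay * w
    c += lr * ((1 - d_real) - d_fake) - decay * c
    # Generator step: move b so the discriminator scores it as real
    d_fake = sigmoid(w * b + c)
    b += lr * (1 - d_fake) * w

print(round(b, 2))  # the generator's output drifts toward the real value 3.0
```

The weight decay on the discriminator is a stabilizing trick for this toy; without some regularization, even this tiny adversarial game tends to oscillate rather than converge, which is one reason real GAN training is notoriously delicate.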
Differentiable neural computers (DNC) are neural networks that learn to form and use complex memory structures, such as lists, trees, and graphs, from scratch. The core of the network is a controller, which operates like the processor in a conventional computer: it takes in input, reads from and writes to memory, and produces output. As a DNC is trained, the controller learns to produce increasingly accurate answers by building and using the right memory structures.
Genetic programming and evolution strategies (ES) are decades-old optimization techniques that simulate natural evolution without the use of backpropagation. After all, evolution is the only known process that has demonstrably produced human-level intelligence. In evolution strategies, a set of candidate solutions to a given problem is evaluated according to a “fitness” function. The best-performing solutions “survive” and the poorly performing ones “die”, while increasingly better solutions are created in the next generation. In his excellent visual introduction to evolution strategies, Google Brain researcher David Ha explains that this approach can serve as an alternative for the “many problems where the backpropagation algorithm cannot be used”.
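A minimal evolution-strategy loop, written in the spirit of (but not copied from) Ha's introduction, needs no gradients at all: sample a population, score it with a fitness function, keep the fittest, and mutate them into the next generation.

```python
import random

# Minimal evolution-strategy loop (an illustrative sketch): no gradients
# and no backpropagation -- only sampling, scoring, selection, and mutation.
def fitness(x):
    return -(x - 3.0) ** 2            # fitness peaks at x = 3

random.seed(0)
population = [random.uniform(-10, 10) for _ in range(50)]
for _ in range(100):
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:10]               # the fittest candidates "survive"
    # Each survivor seeds five mutated children for the next generation
    population = [x + random.gauss(0, 0.5) for x in elite for _ in range(5)]

best = max(population, key=fitness)
print(round(best, 1))  # near 3.0
```

Nothing here requires the fitness function to be differentiable, which is exactly why ES applies to the problems Ha mentions where backpropagation cannot be used.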
Neuroevolution is a subfield of artificial intelligence that uses evolutionary algorithms to evolve the weights (and sometimes the topology) of a neural network, rather than relying on an error-based cost function and the backpropagation method. One of the best-known algorithms is NEAT (NeuroEvolution of Augmenting Topologies), developed by Kenneth Stanley in 2002, and many new neuroevolution algorithms have been developed since.
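A stripped-down example (far simpler than NEAT, which also evolves the network's structure) evolves the three weights of a one-neuron network until it computes logical OR, using only mutation and selection:

```python
import random

# Stripped-down neuroevolution sketch (far simpler than NEAT): evolve the
# weights of a one-neuron network until it computes logical OR, using only
# mutation and selection -- no error gradients and no backpropagation.
cases = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def predict(weights, x):
    w1, w2, bias = weights
    return 1 if w1 * x[0] + w2 * x[1] + bias > 0 else 0

def fitness(weights):
    return sum(predict(weights, x) == y for x, y in cases)

random.seed(0)
population = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
for _ in range(100):
    population.sort(key=fitness, reverse=True)
    survivors = population[:5]        # elitism: the best weights live on
    mutants = [[w + random.gauss(0, 0.3) for w in p]
               for p in survivors for _ in range(3)]
    population = survivors + mutants

best = max(population, key=fitness)
print(fitness(best))  # 4: all four OR cases correct
```

The step activation in `predict` has no useful gradient, so backpropagation could not train this network directly; evolution sidesteps the problem entirely.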
Probabilistic programming enables us to create learning systems that make decisions in the face of uncertainty by making inferences from prior knowledge. According to Avi Pfeffer in his book Practical Probabilistic Programming, you first create a model that captures knowledge of your domain in quantitative, probabilistic terms, then apply this model to specific evidence to generate an answer to a query. This process is called inference.
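Pfeffer's book uses the Figaro language; the same model-evidence-query loop can be sketched in plain Python with inference by enumeration (the probabilities below are invented for illustration):

```python
# Tiny inference-by-enumeration sketch in plain Python (Pfeffer's book uses
# the Figaro language; the probabilities here are invented for illustration).
p_rain = 0.3                 # prior knowledge: probability of rain
p_wet_given_rain = 0.9       # grass is usually wet when it rains
p_wet_given_dry = 0.2        # sprinklers sometimes wet the grass anyway

# Evidence: the grass is wet. Query: how likely is it that it rained?
joint_rain = p_rain * p_wet_given_rain             # P(rain and wet)
joint_dry = (1 - p_rain) * p_wet_given_dry         # P(no rain and wet)
posterior = joint_rain / (joint_rain + joint_dry)  # Bayes' rule
print(round(posterior, 3))  # 0.659
```

The model captures domain knowledge as probabilities, the evidence conditions it, and the query returns a calibrated answer: exactly the inference process Pfeffer describes, just scaled down to two variables.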
AI-specific hardware may also play a key role in closing the gap to general artificial intelligence. Conventional transistor-based processors execute instructions largely serially, whereas the neurons in animal brains operate massively in parallel. Examples of hardware designed explicitly for artificial intelligence include Google’s Tensor Processing Unit (TPU), IBM’s neuromorphic chips, and optical computing systems developed to speed up matrix operations.
Will any of these approaches prove to be the real breakthrough to human-level machine intelligence? Nobody really knows, but Hinton’s provocative statements have definitely accelerated the race to push the cutting edge of AI research beyond the limits of deep learning.