The Boogeyman Argument that Deep Learning will be Stopped by a Wall
I’m always seeking out arguments against my present beliefs (or models of reality). Gary Marcus wrote a new essay titled “Deep Learning: A Critical Appraisal” in which he points out the many flaws of Deep Learning. He has a vested interest in seeing Deep Learning fail. After all, in 2001 he wrote a book, of which he is still very proud, that disparaged the nascent Artificial Neural Network research of the time. He writes:
To understand human cognition we need to understand how basic computational components are integrated into more complex devices- such as parsers, language acquisition devices, modules for recognizing objects, and so forth.
Marcus is very motivated to point out the lack of success of neural networks at every opportunity. His latest essay is one of his many attempts to claim higher understanding by means of criticism.
Nevertheless, let’s explore Marcus’ newest arguments, because they may be valuable in pointing out flaws that we may be biased against noticing. Marcus enumerates the following flaws in present-day Deep Learning:
Deep learning thus far is data hungry
Deep learning thus far is shallow and has limited capacity for transfer
Deep learning thus far has no natural way to deal with hierarchical structure
Deep learning thus far has struggled with open-ended inference
Deep learning thus far is not sufficiently transparent
Deep learning thus far has not been well integrated with prior knowledge
Deep learning thus far cannot inherently distinguish causation from correlation
Deep learning presumes a largely stable world, in ways that may be problematic
Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted
Deep learning thus far is difficult to engineer with
These are all valid arguments, and they apply not only to Deep Learning but to any algorithm that gains knowledge from digesting data, which is to say all machine learning algorithms. Just replace the phrase “Deep Learning” with “Machine Learning” and Marcus’ arguments will remain equally valid.
There is no deep insight here that any researcher in the Deep Learning field is unaware of. These are all known unknowns. What I mean by this is that we, the researchers, all know the flaws mentioned by Marcus and are currently seeking to discover new algorithms to fix these flaws.
Of course, I’m not the only researcher who is aware of the limitations of Deep Learning described in his essay. Marcus’ essay got some immediate responses on Twitter:
Of course, the key questions are, “Is Deep Learning flawed enough that it is the wrong approach to move forward? And if it is the wrong approach, then which among the other approaches is more promising?”
To his own credit, Marcus does make an effort to address these two questions.
To avoid being cast as the best-known skeptic of Deep Learning, Marcus points out that Deep Learning is one of the many tools that may emerge. This time, Marcus doesn’t say that Deep Learning is wrong (as he usually does) but takes a more conservative stance, saying that it will be one useful tool in a toolbox of many others. This argument highlights the fundamental flaw in Marcus’ thesis since 2001. Being a cognitive psychologist, he observes capabilities found in humans and then deduces that all kinds of cognitive machinery need to exist for each capability to work. However, he doesn’t have an explanation as to (1) how each kind of machinery works and (2) how these many kinds of machinery coordinate to get anything accomplished.
Where Marcus errs most is in failing to comprehend that Deep Learning is the stepping-stone tool that other cognitive tools will leverage to achieve higher levels of cognition. We’ve already seen this in DeepMind’s AlphaZero game-playing systems, where conventional tree search is used in conjunction with Deep Learning. Deep Learning is the wheel of cognition. Just as the wheel enabled more effective transportation, so will Deep Learning enable effective artificial intelligence.
We can have wheels made of stone, wheels crafted from wood and wheels with inflatable rubber tires, yet they are all round. There are of course alternatives to wheels for land transportation (e.g. skis, hovercraft, maglevs and hyperloops), but few have the practicality of the conventional wheel. Deep Learning is an instance of an intuition machine, and intuition is the wheel for higher-level cognition, not just another tool. There is no cognitive mechanism that we are aware of, other than intuition, that can give us general intelligence (GOFAI has failed us for decades because it assumed that rational cognition was the basis of intelligence). Marcus’ criticisms are analogous to saying that wheels made of stone aren’t any good because they are difficult to create, aren’t perfectly round and don’t provide any cushion. However, the real problem is that the human mind is not an “Algebraic Mind”, as the title of his book proclaims. Marcus will just have to get over himself and come to the realization that he’s been wrong since 2001. To build AGI you first work on intuition and then you work up the stack, not the other way around:
All of the innate cognitive machineries of the human brain are intuition based components. Unlike our digital computers, there are no logical components. Our rationality comes from learning through experience and is not some hardwired built-in machinery.
Over time, we will develop more advanced forms of Deep Learning. The learning algorithm will change into one that is meta-learning driven. The simplistic neurons will give way to more complex kinds with multiple thresholds. There’s really no looking back here. The methods of Deep Learning are being established and refined. Knowledge discovery requires search, and search has two extremes: exploration and exploitation. The solution of the future will of course be an algorithm that understands the best balance between the two.
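To make the exploration versus exploitation trade-off concrete, here is a minimal sketch using an epsilon-greedy agent on a multi-armed bandit. This is a textbook illustration rather than anything taken from Marcus’ essay or from a particular Deep Learning system; the function name, the Gaussian reward model and the parameter values are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(true_means, epsilon, steps, seed=0):
    """Run a hypothetical epsilon-greedy agent on a Gaussian bandit.

    true_means: assumed average payoff of each arm (unknown to the agent).
    epsilon:    probability of exploring (picking a random arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try any arm, even one that currently looks bad.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running average for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = epsilon_greedy([0.0, 0.5, 2.0], 0.1, 5000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 2) for e in estimates])
```

Setting epsilon to 1.0 gives pure exploration and setting it to 0.0 gives pure exploitation. A fixed value in between is only one crude way of balancing the two, which is exactly why an algorithm that learns the balance is still an open goal.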
From my own perspective, the path toward Artificial General Intelligence (AGI) is clear. I acknowledge all of the shortcomings that Marcus points out. However, without a doubt, the methodology and techniques being invented by the Deep Learning research community are slowly chipping away at the problem. To quote the stonecutter credo:
When nothing seems to help, I go and look at a stonecutter hammering away at his rock perhaps a hundred times without as much as a crack showing in it. Yet at the hundred and first blow it will split in two, and I know it was not that blow that did it, but all that had gone before.
The true game that is being played by Gary Marcus (which he successfully parlayed into an acquisition of his firm Geometric Intelligence by Uber) is in criticizing the dominant AI paradigm of today. By pointing out its flaws, he’s able to convince less knowledgeable investors of an alternative and perhaps more profitable path. Investors take great pride in having contrarian investment strategies. Investors, like most humans, would like to believe that their success was based on their own individuality and not just plain luck. For the majority of investors, it just happens to be the latter.
Politics in science has always been present and it’s not going to disappear any time soon. We are familiar with the feud between Nikola Tesla and Thomas Edison. Edison died a wealthy man, in stark contrast to Tesla, who died penniless. Although the scientific contributions of Tesla arguably surpass Edison’s, Edison is more famous today, and Tesla is likely only well known because an electric car company is named after him. The Canadian conspirators have successfully parlayed their Deep Learning meme to great effect. Jürgen Schmidhuber was justified in feeling like Tesla in a world which seemed to have overlooked his own contributions.
The game is also played by DeepMind. If you read the AlphaGo Zero paper (the most significant development in AI since DL) you will find that DeepMind never uses the term “Deep Learning”. That is because they intend to change the narrative. DeepMind discovered something extremely significant that differs from Deep Learning in its original conception. Unfortunately, DeepMind has not figured out an appropriate term for the “self-play” they discovered, so they awkwardly call it “Reinforcement Learning”. Yann LeCun is a little bit more savvy with branding, which is why he came up with “Predictive Learning” to describe a yet-to-be-discovered solution to unsupervised learning. (Note: A recent note by LeCun mentions that he wants to re-brand Deep Learning as Differentiable Programming.)
We all strive not to be just another brick in the wall in the invention of “the last invention of man”. I just finished watching the AlphaGo film on Netflix. It’s amazing that DeepMind had the foresight to ensure that this event would be captured on film. The stars of this movie are of course Demis Hassabis and David Silver. However, just as Stan Lee has a cameo role in every Marvel film, some grainy footage of Sergey Brin had to be spliced in. Eric Schmidt also had his own cameo role, but it didn’t look contrived. ;-)
Has Deep Learning hit a wall? Very far from it. 2018, as predicted, will be a banner year. The notion of a wall that will stop Deep Learning progress is at best a boogeyman argument that is not only imaginary but also misleading. You have the choice to agree with Marcus’ arguments and wait for some unknown that is supposedly better, or you can recognize that the path to AGI is clearer than it has ever been and use the methods that have led to remarkable success in recent years.