Why Deep Learning Experimentalists Should Not Monopolize the Conversation About AGI
Many dislike the deep learning community because it is steeped in bro culture and excessively analytic types. Not everyone in the community is like this; @geoffreyhinton and Yoshua Bengio are good examples of wholesome researchers. Yet it’s obvious that the field is infested with toxicity.
However, I’m excited about this field because it reveals a path that could lead to a richer and more diverse humanistic philosophy. It’s quite ironic that, in our search to understand general intelligence, we arrive at a deeper understanding of what makes us human.
Analytic philosophers have historically dominated the philosophy of mind. Bertrand Russell and the “early” Wittgenstein believed in the limitless power of logic. But this tradition hit a wall when it confronted general intelligence. It’s the same wall the GOFAI folks hit.
Kenneth Stanley @kenneth0stanley and Joel Lehman @joelbot3000 are both successful deep-learning researchers. They wrote a book about open-endedness that offers a glimpse of the humanity that exists in a messier reality (see: https://www.amazon.com/Why-Greatness-Cannot-Planned-Objective/dp/3319155237/ref=sr_1_1).
As you study general intelligence, you start to realize that it’s orthogonal to formality and predictability, that it’s inextricable from living things, and that autonomy is orthogonal to brute-force computation.
In our midst is an endless variety of general intelligences occupying an endless diversity of niches. Conscious minds are a consequence of living in this world (see: https://www.amazon.com/Metazoa-Animal-Minds-Birth-Consciousness/dp/000832123X/ref=sr_1_1).
To understand general intelligence, one first needs to grasp that living beings are not things but processes: processes that discover meaning in sustaining their own identity. This introduces us to process metaphysics.
This leads us down a path from phenomenology to cybernetics, ecological psychology, and enactivist 4E cognition. When we study complex adaptive systems, we are forced to develop an alternative language, decoupled from the explanatory forms of classical physics.
Decades ago, software engineering was forced to challenge the assumed rigidity of its development processes. This led to pattern languages and agile methodologies, because software development is complex enough that it doesn’t work like a factory floor.
The humanities were thus introduced into the software field. Although there is a conceptual connection between cybernetics and deep learning, the deep learning community originates elsewhere: it traces its roots to fields that developed numerical tools for prediction.
In the late 1950s, as more researchers became familiar with computer technology, cybernetics was pushed aside in favor of exploiting the symbolic processing of computers. Hence, for decades, GOFAI dominated the agendas of AI research.
But even before researchers realized the utility of computers for symbol manipulation, computers were harnessed for numerical computation; the very first computers were tasked with simulating nuclear processes.
An entire field, computational science, emerged somewhat distinct from traditional computer science. Its areas span elementary particle physics up to economic modeling, and its methods differ from those found in GOFAI.
Computational science never aspired toward the development of AI; researchers were content with creating algorithms that led to better predictions. Connectionism, meanwhile, has always been a challenger to symbolism in AI, so as deep learning emerged, GOFAI was challenged again.
There’s a difference between fields striving to uncover the big picture and fields satisfied with incrementally improving their methods. Deep learning fused the big-picture connectionists with the numerical-methods tinkerers. The latter blindly pursue leaderboards.
The effect is that few in the community pursue the big picture. Climbing the leaderboards depends on mathematical, programming, and engineering skills; you can’t have ideas implemented without people skilled at extracting computation out of existing hardware.
The success of deep learning is a consequence of the emergence of GPU hardware and of crowdsourced data from the internet. Deep learning has architectural features that exploit both in ways other numerical methods could not.
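To make that concrete, here is a minimal sketch (in NumPy; the layer sizes and variable names are illustrative, not from any particular system) of why deep learning maps so naturally onto GPUs: the core of a network’s forward pass is one large matrix multiply, precisely the kind of operation massively parallel hardware executes across thousands of cores at once.

```python
import numpy as np

# A dense layer's forward pass reduces to a single large matrix multiply.
# On a GPU, this same expression dispatches across thousands of cores;
# many classical numerical methods are instead dominated by serial steps.
rng = np.random.default_rng(0)

batch, d_in, d_out = 1024, 4096, 4096    # illustrative sizes
x = rng.standard_normal((batch, d_in))   # a batch of input data
W = rng.standard_normal((d_in, d_out))   # learned weights
b = np.zeros(d_out)                      # learned bias

h = np.maximum(x @ W + b, 0.0)           # matmul + ReLU: embarrassingly parallel
print(h.shape)                           # (1024, 4096)
```

No output element depends on any other, so the whole computation parallelizes trivially, which is why GPU hardware and web-scale data were exactly the combination deep learning was positioned to exploit.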
Richard Feynman, in the 1980s, consulted for a company called Thinking Machines, which sought to exploit massive parallelism to perform physics simulations and perhaps eventually lead to AI (see: Thinking Machines Corporation on Wikipedia).
It thus took almost 30 years before this idea of massive parallelism was seriously linked to the notion of artificial intelligence. Science requires technology to mature for certain new truths to emerge.
But now that we have specialized massively parallel computational systems at our disposal, we are discovering new computational truths. These truths are fundamental and thus applicable to a much wider domain than the pursuit of AI alone.
The emergence of computers led to the formulation of the Church-Turing thesis (the foundation of computability theory). Surprisingly, it’s not just a thesis about human-invented computers but one that appears to hold universally for all of reality (physics included).
But now that we have massively parallel computers and a very odd way of “programming” them, we are uncovering new truths that emerge from huge computational workloads. We see emergent behavior in systems like GPT-3. These are the supercolliders of the computational regime.
But here is the problem with many deep learning researchers: they are analogous to the experimental physicists who construct supercolliders. Theorists don’t have the skills to do that work. Yet these experimentalists also assume themselves to be the equals of theoretical physicists!
The division between theorists and experimentalists may be unique to physics. There’s no such thing as a theoretical neuroscientist; some people formulate theories of the mind, but they are usually placed in the bucket reserved for philosophers.
There are theoretical biologists, though. Perhaps it’s an indicator of a field’s maturity when it can support a split between theorists and experimentalists. Deep learning is simply too new for that split to have occurred.
I see the field progressing. Just as computers led to both symbolic and numerical algorithms, DL will progress to pursue both AI and specialized scientific domains. As with computational science, there will be specialization across different scientific fields (see: Deep Learning is Splitting into Two Divergent Paths).

To conclude, the advances of DL go beyond the development of automated cognition. They lead to a new way of thinking about general intelligence, a new way of understanding massive computation, and, finally, new tools for parsing complexity in other domains.
Thus it is entirely a disservice when the predominant bro culture of the experimentalist community bullies its worldview onto other scientific domains. Talking points like “The Bitter Lesson,” “Reward is Enough,” and “Scale is all you need” are expressions of a monopolized thought process. It’s an ugly manifestation of misplaced hubris.
Deep learning has a culture where you cannot have an opinion about aerodynamics because you aren’t a Formula 1 mechanic. It’s toxic reductionism.