The False Equivalence of Categorization as Thought
The most glaring flaw of Deep Learning is masked because our entire society has been indoctrinated into an ontology based on substance metaphysics. We assume cognition has a feature that simply does not exist, yet we believe it is real because of that metaphysics. Let me explain!
The value of deep learning became apparent in 2012 when it began topping the leaderboards in image classification benchmarks. The proof of its usefulness was its ability to perform image categorization.
This got everyone excited, because categorization is thought of as the core of cognition. This is a category error. We believe categorization is at the core because our philosophy is based on things. It is substance-based, not based on processes and relations.
The limits of deep learning’s ability to categorize its observations accurately were revealed when adversarial attacks were discovered. To this day the problem has never been fixed, and perhaps it never will be, because it is fundamentally impossible. But many cannot see why it is impossible.
This is because general intelligence is not based on categorization. Solving the categorization problem does not lead to general intelligence. In fact, the problem cannot be solved. The problem is an illusion that is generated as a consequence of our substance metaphysics.
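To make “adversarial attack” concrete, here is a minimal Python sketch of the fast gradient sign method (FGSM), one of the earliest published attacks. It assumes PyTorch and a recent torchvision; the choice of ResNet-18 and the value of epsilon are illustrative assumptions, not details of any specific system discussed here.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Any pretrained image classifier will do; ResNet-18 is an arbitrary choice.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_attack(image, label, epsilon=0.01):
    """Fast Gradient Sign Method: nudge every pixel slightly in the
    direction that most increases the classifier's loss."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # The perturbation is imperceptible to a human yet can flip the predicted class.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Usage sketch: `image` is a (1, 3, 224, 224) tensor scaled to [0, 1],
# `label` is a (1,)-shaped tensor holding the true class index.
```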
Our messy world cannot be neatly arranged into hierarchical categories. This explains why we’ve all but abandoned trying to organize our bookmarks: it takes more work to maintain ontologies than the utility we can extract from them.
How we measure progress in deep learning is very peculiar. We often delude ourselves that we can discover an objective mathematical measure that indicates progress. Absent that, we base our measure on a finite data set, which is exactly what benchmarks do.
This is also broken, because DL algorithms learn to game the measure (see: Goodhart’s law). It is often claimed that adversarial attacks are a consequence of flawed solutions, discovered during training, that nonetheless satisfy the reward: an error in the robustness of our objectives.
We end up in a never-ending game of specifying objectives that are continuously circumvented by deep learning systems always seeking the easiest path to gaming the metrics. Our delusion has become so endemic that we believe in the gospel of “Reward is Enough.”
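A toy sketch of Goodhart’s law in this setting, assuming nothing more than NumPy; the benchmark, the class balance, and the degenerate “model” below are invented purely for illustration.

```python
import numpy as np

# A hypothetical benchmark where 95% of the test images are "dog" and 5% are "cat".
rng = np.random.default_rng(0)
labels = rng.choice(["dog", "cat"], size=1_000, p=[0.95, 0.05])

# A degenerate "solution" that games the metric: always predict the majority class.
predictions = np.full_like(labels, "dog")

accuracy = (predictions == labels).mean()
print(f"Benchmark accuracy: {accuracy:.1%}")  # roughly 95%, with no notion of what a cat is
```

The measure looks excellent while the system has learned nothing that the measure was meant to stand for.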
We also fail to realize that deep learning systems arrive at solutions for induction and fluency that are different from human ones. We hold on to the fiction that because these systems are capable, they must be doing what humans do. Deep learning’s artificial intuition is not like human intuition.
Today we have next-level systems based on transformer and diffusion architectures. These are generative models that produce text and images with a fluency that is difficult for humans to accept. We are seeing real signs of panic about the rapid emergence of AGI!
Unlike the convolutional networks of the past, the capabilities of these generative systems were not discovered through a categorization leaderboard. Rather, they were discovered by humans “eye-balling” the results. They perform beyond our ability to invent a mathematical measure!
Generative systems avoid the categorization problem because they simply don’t attempt it. These language models exploit the statistical regularities embedded in our language use. These systems discover the regularities found in human discourse.
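As a drastically simplified sketch of that idea (not of how transformers actually work), consider a bigram model that captures nothing but co-occurrence statistics; the toy corpus is invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus; real language models train on vastly more text and learn far
# richer statistics, but the spirit is the same.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    # Return the continuation seen most often after `word` in the corpus.
    return bigrams[word].most_common(1)[0][0]

print(predict_next("sat"))  # 'on': a regularity of the text, not an understanding of sitting
```

The model never forms a concept of sitting or of a mat; it only reproduces the regularities of the discourse it was fed.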
As a side-effect of exploiting these regularities, generative language systems are capable of conceptual blending. Diffusion architectures are the next big thing, though I will argue that we might not have realized their usefulness without the prior invention of large language models.
When you play around with tools like Dall-E, you begin to get a sense of the composability of concepts. Human cognition infers concepts like “chair-ness” and “avocado-ness” such that we have a sense of the appropriate kind of blending.
Conceptual blending is fundamental to human thought. The current panic about AGI emergence stems from the fact that language and diffusion models are mimicking our cognitive ability to perform it. It’s freaking everyone out!
Human cognition does not revolve around categorization, rather, it’s based on just-in-time conceptual blending. Douglas Hofstadter identified this as analogy-making.
Deep Learning has bootstrapped its capabilities by exploiting human-generated artifacts found in text and images. But we should not be fooled by this mimicry. It is only replicating our habits. It is not replicating human cognition.
Neural network classifiers cannot explain why they infer a classification. They may distinguish a dog from a cat, but they can’t explain why. It’s what Daniel Dennett calls competence without comprehension.
Whether deep generators like GPT-3 or Dall-E can explain themselves is still up for debate. There appear to be glimpses of explanatory capability in both systems. But it may be like split-brain patients, whose left brain conjures up explanations disconnected from reality.
We should not mistake fluency for competence. Our civilization has indoctrinated us to value verbal competence over actual understanding. It is what @pmarca notes about the domination of “wordcels” over “shape rotators.”
What, then, comes after Artificial Fluency, now that it already exists in our midst? What is missing is whatever can bridge the gap between autonomous intelligence and human-level abduction. We are very far from either capability.
Perhaps it’s because, centuries ago, civilization favored the utility of substance metaphysics to govern its population. We are on the wrong path. Perhaps we should have listened instead to the process philosophers. Maybe it’s time we did.