Why build a “dumb” AGI system?

Can you make your AI as dumb as Homer?

As of now, building a highly capable, smart AGI system is out of our reach. There are three main reasons:

Computing capabilities: The only highly capable general intelligence system we know of is the human brain, which is estimated to operate at about 10¹⁸ FLOPS [1]. According to Ray Kurzweil [2], we should not expect that kind of computing capability to be available until the mid-2030s. So today's AGI development should not focus on exhibiting human-level intelligence. We may develop algorithms that produce similar capabilities with fewer FLOPS, but for now we would still be several orders of magnitude short. At this point, algorithms that show human-level results on specific tasks are unlikely to be stepping stones to general intelligence.
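
As a rough back-of-the-envelope illustration of that gap (the brain figure is the estimate from [1]; the single-accelerator figure is an assumed, order-of-magnitude number, not a measured spec):

```python
# Back-of-the-envelope comparison of an estimated brain compute budget with a
# single accelerator. Both numbers are rough, order-of-magnitude assumptions.
import math

BRAIN_FLOPS = 1e18        # estimate from [1]
ONE_GPU_FLOPS = 1e13      # assumed order of magnitude for one modern GPU

gap = BRAIN_FLOPS / ONE_GPU_FLOPS
print(f"shortfall: ~10^{math.log10(gap):.0f}x")   # -> shortfall: ~10^5x
```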

No focus on AGI and the bias toward commercialization (narrow AI): The AI community has been enamored with the lucrative possibilities of solving a certain class of problems. The results are creative, impressive, and useful, and will influence (for good and bad) the lives of billions of people. But this bias toward commercialization is pulling the research community's focus away from the goal of building AGI. While the AI community focuses on building highly capable algorithms for specific problems, the AGI community needs to get funded and be able to focus on bare-bones solutions for general AI, as dumb as they may look compared to narrow AI solutions.

Hyper-parameters: For a human brain to develop properly, appropriate values for certain hyper-parameters [3] are critical, especially during early development. It is possible that we will eventually find good AGI algorithms and have enough computing capability, but if the hyper-parameter values are suboptimal, the resulting system will not look very intelligent. Given that the subspace of “good” hyper-parameter values is expected to be tiny compared to the space of possible values, early attempts are likely to be lackluster. One thread of research I am exploring is algorithms that are essentially hyper-parameter-free: all hyper-parameters would be learned by the system through interaction with the environment, and only resource constraints would be specified as hyper-hyper-parameters, e.g., maximum energy usage.
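
To make that concrete, here is a minimal, purely hypothetical sketch of the idea. The class, its names, and the update rules are all illustrative assumptions, not a worked-out algorithm; the point is only that the single externally supplied quantity is a resource budget, while the learning rate is adapted from the agent's own experience.

```python
# Hypothetical sketch: the only externally supplied quantity is a resource
# budget (a "hyper-hyper-parameter"); the learning rate starts at an arbitrary
# value and is adjusted from the error signal rather than hand-tuned.
class SelfTuningAgent:
    def __init__(self, max_energy):
        self.max_energy = max_energy   # resource constraint, e.g. max energy usage
        self.energy_used = 0.0
        self.learning_rate = 0.1       # initial guess; adapted below
        self.prev_error = None

    def learn(self, error, step_cost=1.0):
        """Adapt the learning rate from experience until the budget runs out."""
        if self.energy_used + step_cost > self.max_energy:
            return False                    # budget exhausted; stop learning
        if self.prev_error is not None:
            if error < self.prev_error:     # improving: explore a bit faster
                self.learning_rate *= 1.05
            else:                           # regressing: back off
                self.learning_rate *= 0.5
        self.prev_error = error
        self.energy_used += step_cost
        return True


agent = SelfTuningAgent(max_energy=100.0)   # the only hyper-hyper-parameter
for error in [1.0, 0.8, 0.9, 0.6]:          # stand-in error signals from the environment
    agent.learn(error)
```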


Given the reasons above, our only option for now is to build “dumb” AGI systems. I use the term “dumb” in the following sense, not in reference to any physical disability.

dumb (/dəm/)
NORTH AMERICAN - informal
stupid.
“a dumb question”
synonyms: stupid, unintelligent, ignorant, dense, brainless, mindless, foolish, slow, dull, simple, empty-headed, stunned, vacuous, vapid, idiotic, half-baked, imbecilic, bovine

bovine: To find stepping stones toward human-like intelligence, we should look down and around the evolutionary tree. Every species that has ever existed has demonstrated some aspects of general intelligence by surviving and reproducing. General intelligent agents should be tested against the capabilities demonstrated by the earliest forms of life. The day AGI systems reach bovine capabilities will be a day to celebrate!

unintelligent and slow: That is what we can achieve today, given the algorithmic and computing constraints listed above.

half-baked: We are far from production-ready AGI systems. Early AGI systems will be neither scalable nor perfectly architected. That is fine.

simple: Simplicity as in Occam’s razor.

“nature is the realisation of the simplest conceivable mathematical ideas” (Einstein, 1954).

I believe that the basic rules that give rise to intelligence are simple, and so should be the basic algorithms for AGI. They should be derivable from first principles and understandable by an average 10-year-old.


Hence, I am dedicating my time and resources to building a “dumb” AGI system. Hopefully I can contribute a few stepping stones that future researchers will use to reach AGI… and the singularity and beyond.

[1] AI Impacts, “Brain performance in FLOPS.” http://aiimpacts.org/brain-performance-in-flops/
[2] Ray Kurzweil, The Singularity Is Near.
[3] Hyper-parameters of an AI system dictate how the system is organized, how fast learning proceeds, and so on. In contrast, the parameters of the system are learned during training.