Is AI marching steadfastly toward human-level intelligence?

Venkateshan K
Published in The Startup · Aug 27, 2019
Source: Max Pixel

Artificial intelligence (AI) has proven to be remarkably successful in a number of fields — beating the best human player of Go, identifying faces in a video, engaging in an eerily human-like conversation and generating entire paragraphs of meaningful text.

Even more remarkably, most of this development has occurred in the last few years, highlighting another aspect of AI: its exponential rate of progress, with rapid improvements in performance on specific tasks and an explosion in the domains where it is applied. This pace of growth has, in fact, characterized the digital age more generally, but the gains seem particularly pronounced with AI.

If we were to extrapolate the exponential growth of AI capabilities, it may seem that in the not-too-distant future we would have AI systems first attaining, and then surpassing, human-level intelligence. Such runaway AI could mean a super-intelligent system that controls everything in the world, leaving the fate of humanity unclear.

What should we make of such a prognosis? Is it true that we are very close to a breakout in human-level AI, or could it be that, despite all the impressive achievements of AI, the sort of intelligence humans take for granted is still far out of the reach of AI technology?

To answer that question, we first need to distinguish two intertwined issues: the continuation of exponential growth in AI, and the progressive emergence of artificial human-level intelligence.

Exponential growth is an exception, not a rule

While the trajectory of electronics and communications technology might suggest that all technological progress is exponential, the reality is far less exciting. There are, in fact, many areas where development has been significantly slower than one might have hoped.

Consider a few examples of technologies that held a lot of promise at some stage but where actual advances have been quite uneven: fusion reactors, quantum computing, personalized medicine, space exploration, novel modes of transportation, the harnessing of alternative forms of energy, and so on.

This is not to say that there has been no progress in these areas, that the effort is somehow lacking, or that there cannot be an exponential breakout in the future. It is merely to counterbalance the mistaken view that exponential growth characterizes all technological advancement.

In fact, the exponential progress we see in digital technology ultimately traces back to a very fortuitous scenario in which every bit of extra room in silicon has been squeezed to create paths for electron flow, with the result that Moore's law (a prediction based on extrapolation, not a law in the sense of physical laws) continues to be a useful heuristic more than five decades after it was first stated.

Of course, Moore's law cannot be sustained forever, and the growth will eventually plateau. The exponential progress we have seen is not only exceptional but also confined to a finite window of time.

Narrow AI vs Artificial General Intelligence

Whatever the fate of Moore's law, there can be no doubt that neural networks have proven formidable at specialized tasks, and this upward trajectory is likely to continue. In the not-too-distant future we will probably have autonomous vehicles on the roads, personal assistants that keep getting more versatile, household robots taking care of the dull chores, AI systems diagnosing diseases and detecting malfunctioning organs, and so on.

Nonetheless, this is quite different from general intelligence: a single autonomous system capable of performing a diverse range of tasks and of integrating the information and experience gained along the way into future performance. This is essentially how human intelligence functions, and, to a great extent, that of several animal taxa.

The question, then, is how far existing AI technology is from achieving artificial general intelligence (AGI). A related question: are the current methods built around deep learning the right approach if we are seeking to develop AGI? The mere fact that deep learning regularly outdoes itself on specialized tasks is not evidence that we are any closer to general intelligence.

To be sure, machines outperforming humans at certain tasks is hardly new. Calculators can determine the product of two six-digit numbers faster than almost any human. The original floppy disk, despite its limited storage by today's standards, could easily hold 1,000 numbers to six decimal places, something the majority of humans would struggle to do. More than two decades ago, IBM's Deep Blue defeated Garry Kasparov, one of the greatest chess champions of all time, in what was then considered another human frontier breached by computers.

Is there any reason to believe that the recent successes of deep learning are so fundamentally different that they hold the promise of an AGI takeoff, a promise that was absent with the development of the calculator or the success of Deep Blue? Or is this simply another chapter in the development of AI, one that revolutionizes technology, society and the economy but remains far from attaining AGI?

To answer that, we will examine the fundamental ways in which the training and computation of neural networks differ from the learning, behavior and characteristics of human cognition and intelligence.

1. Enormity of deep learning models

But before we do that, let us examine how much of the recent breakthroughs have been achieved by a "brute-force" increase in the number of computations.

It turns out that the number of computations used to train recent deep learning models has outpaced even Moore's law, as shown in an interesting analysis by OpenAI. The models have certainly scaled new peaks in performance, but this appears to have been driven, at least in part, by a sheer increase in the computational steps required to train them.
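To put the two growth rates side by side, here is a back-of-the-envelope sketch (the 3.4-month doubling time is the figure reported in the OpenAI analysis; everything else is purely illustrative):

```python
# Rough comparison of growth rates (illustrative figures only):
# Moore's law is usually quoted as a doubling roughly every 2 years,
# while OpenAI's "AI and Compute" analysis estimated a ~3.4-month doubling
# time for the compute used in the largest training runs.

moore_doubling_months = 24
ai_compute_doubling_months = 3.4

years = 6
months = years * 12

moore_growth = 2 ** (months / moore_doubling_months)
ai_compute_growth = 2 ** (months / ai_compute_doubling_months)

print(f"Over {years} years, Moore's law implies ~{moore_growth:.0f}x more transistors,")
print(f"while the AI-compute trend implies ~{ai_compute_growth:,.0f}x more training compute.")
```

Over the same six years, one trend yields a factor of about eight while the other yields a factor in the millions, which is why raw compute, rather than algorithmic insight alone, deserves part of the credit for recent results.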

To get a sense of the massive architecture involved: the language representation model BERT uses 110 million parameters and takes 96 hours to train on 16 Tensor Processing Units (TPUs), consuming about 1,500 kWh of energy in the process. That should give some hint of why it is quite dissimilar to the human brain.
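For the curious, the headline parameter count can be roughly reproduced from the published BERT-base hyperparameters (this is only an approximation; layer norms and the pooler add a little more):

```python
# Back-of-the-envelope parameter count for a BERT-base-style transformer,
# using the published hyperparameters: 12 layers, hidden size 768,
# feed-forward size 3072, ~30k-token WordPiece vocabulary.

vocab, hidden, ffn, layers, max_pos, seg = 30522, 768, 3072, 12, 512, 2

embeddings = (vocab + max_pos + seg) * hidden                 # token + position + segment embeddings
attention_per_layer = 4 * (hidden * hidden + hidden)          # Q, K, V and output projections (with biases)
ffn_per_layer = hidden * ffn + ffn + ffn * hidden + hidden    # two dense layers (with biases)
per_layer = attention_per_layer + ffn_per_layer               # layer norms add a little extra

total = embeddings + layers * per_layer
print(f"~{total / 1e6:.0f} million parameters")               # roughly 109 million
```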

There is, however, no reason to assume that the process of attaining AGI must resemble how intelligence evolved in humans, or even that the architectures are likely to be the same, let alone the energy consumption.

Yet, this is only the beginning of the divergence between deep learning approaches and intelligence that is observed in humans.

2. Large Training Data

The amount of training data needed by typical neural networks is far greater than what a human needs to learn a comparable task. The number of images that deep learning models are trained on for image recognition problems is of the order of hundreds of thousands, if not more, yet a five-year-old exposed to only a tiny fraction of this achieves equal or better results.

This again illustrates something fundamental: however clever the architecture of a neural network may be (convolution filters, memory of earlier states, attention mechanisms), its performance is driven primarily in a narrow, brute-force manner in which all the primary variations found in the data are encoded in the network, as the sketch below illustrates.
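Here is a minimal, hypothetical supervised-learning sketch in PyTorch (synthetic data stands in for a real labelled dataset): nothing in the loop resembles one-shot learning; the model improves only by fitting input-output regularities over many labelled examples.

```python
import torch
from torch import nn

# Minimal supervised-learning sketch. Synthetic data stands in for a real
# labelled image dataset; the point is structural: the network improves
# only by fitting statistical regularities over many labelled examples.
torch.manual_seed(0)
X = torch.randn(10_000, 32)          # 10,000 "images", each reduced to 32 features
y = (X[:, 0] + X[:, 1] > 0).long()   # labels defined by a simple statistical rule

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

accuracy = (model(X).argmax(dim=1) == y).float().mean()
print(f"training accuracy: {accuracy:.2f}")   # high, because the rule is purely statistical
```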

3. Lack of understanding

More than anything else, the greatest limitation of neural network models is that they do not understand the world, no matter how good their performance at classifying images or translating text may be.

For example, if we came across a video showing an unrestrained stone moving upwards through the air instead of undergoing free fall, we would immediately sense that something is odd: either the video is being played backwards, or it is a trick, or perhaps it was taken in an environment outside the influence of gravity. Each of these possible interpretations is based on our accumulated knowledge and experience, and our disbelief of the phenomenon is almost reflexive.

Likewise, if we observed a shattered tea-cup re-assembling itself into the original object, we would not need to know the second law of thermodynamics to determine that some trickery is afoot. Would neural networks be able to recognize this while analyzing videos? How would one even train a neural network to identify such anomalies? We could of course hard-code these as rules, but that is not the same as human understanding.

3A. Inability to identify metaphors

If I told you that someone was given a blank check to pursue their interests, you would not literally think that an empty check was handed to them (even if you had never heard the expression before). That is because you immediately recognize that the sentence does not make much sense if the word check is interpreted literally, since you understand the meaning of the rest of the sentence. Neural networks simply have no way of doing the same.

3B. Awareness of context

Imagine looking at a photo of a living room with a television displaying a person riding a bike. You would think it absurd if somebody suggested that the bike was actually present in the living room. A neural network, however (while possibly identifying the make and model of the bike), would almost certainly not be able to make that distinction.

This is the basic awareness of context necessary for the correct interpretation of visual elements: learning the relations between individual facts and entities so as to create a coherent understanding of the whole.

3C. Abstractions

When asked to predict when AI will be able to outdo humans at mathematical research, AI researchers gave a median time-frame of about 25 years.

How reliable is that estimate?

To do any mathematical reasoning or problem solving, one needs to deal with very high levels of abstraction. Even basic algebra requires an understanding of what a variable is, the conception of a set (of rational numbers, say), the notion of a family of solutions, and so on. (Note: none of this is to be confused with existing computational approaches such as simulations or numerical approximations, where we explicitly define the solution procedure.)

However, deep learning as it exists today would have difficulty grasping even basic abstractions such as commercial versus personal transportation, the various notions of a contract, or the idea of a government or society.

Human learning is not merely statistical correlations

The primary reason for these failings is that deep learning models learn and encode statistical associations among inputs and outputs (both within and between them), and this forms the basis of most of their successful performance. It also explains the need for enormous amounts of data and immense parameter counts, because the number of such associations grows very large (think higher-order correlations) with the number of features in the data. Shallow learning models did poorly at problems like speech recognition because they could not capture such a wide range of statistical associations across the temporal and frequency domains.

However, statistical correlations can only go so far, and they are certainly not the mechanism by which human intelligence works. Besides the well-known fact that statistical associations fail to detect causal patterns, they are also devoid of any structural understanding of the data. Convolutional filters may have greatly enhanced the detection of objects in images, but convolution is merely a feature transformation that enables a more useful set of statistical correlations to be encoded. Statistical correlations they remain nonetheless.
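A toy numerical example (purely illustrative) of the causal blind spot: two variables can be strongly correlated simply because both depend on a hidden confounder, and nothing in the correlation itself reveals that neither causes the other.

```python
import numpy as np

# Two variables that never influence each other, but both depend on a
# hidden confounder z. A purely correlational learner sees a strong
# association between a and b and has no way to tell it is not causal.
rng = np.random.default_rng(0)
z = rng.normal(size=100_000)                 # hidden common cause
a = z + 0.3 * rng.normal(size=100_000)
b = z + 0.3 * rng.normal(size=100_000)

print(f"corr(a, b) = {np.corrcoef(a, b)[0, 1]:.2f}")   # ~0.9, yet neither causes the other
```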

Although I have primarily discussed supervised learning, the same argument applies to unsupervised learning and other modes of applying neural networks. The Generative Adversarial Network (GAN), one of the more interesting formulations using deep learning, is unique in many ways (much of the deep-fakes phenomenon owes to it), but it is not free of the problems of generic deep learning discussed above. Deep reinforcement learning, the sub-field built on reinforcement techniques (the approach behind AlphaGo, for instance), frames learning as an iterative process in which decisions are taken based on a projection of discounted future rewards (the projection, in turn, depending on the evidence observed up to that point). While this procedure is likely to generalize learning to more complex and dynamic environments, it shares the failings of run-of-the-mill neural networks discussed above.
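For reference, the discounted return such agents try to maximize is simply the sum of future rewards, each weighted by a discount factor (between 0 and 1) raised to the number of steps into the future. A minimal sketch:

```python
# Discounted return as used in reinforcement learning: the value attached to
# a decision is the sum of future rewards, each weighted by gamma**k so that
# rewards further in the future count for less.
def discounted_return(rewards, gamma=0.99):
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A reward of 10 arriving three steps from now contributes 0.99**3 * 10 ≈ 9.7.
print(discounted_return([1.0, 0.0, 0.0, 10.0]))   # ≈ 10.70
```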

Similarly, although there has been a lot of work on transfer learning, where models trained on one type of task are used to predict something different, this does little to change the fact that ultimately there is no real understanding of the input data; transfer learning works because the features encoded in the higher layers are usually relevant to more than one task.
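The standard recipe looks roughly like the following sketch (hypothetical PyTorch/torchvision code; in practice the backbone would be loaded with weights pretrained on a large dataset such as ImageNet rather than left randomly initialized): the lower layers are frozen precisely because the features they encode tend to carry over, and only a new task-specific head is trained.

```python
import torch
from torch import nn
from torchvision import models

# Sketch of the usual transfer-learning recipe: reuse a backbone trained on
# one task, freeze its layers, and train only a new head for the target task.
# In practice the backbone would carry pretrained (e.g. ImageNet) weights;
# here it stays randomly initialized so the example runs offline.
backbone = models.resnet18()
for param in backbone.parameters():
    param.requires_grad = False                          # keep the transferred features fixed

backbone.fc = nn.Linear(backbone.fc.in_features, 5)      # new head for a 5-class target task
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)

x = torch.randn(8, 3, 224, 224)                          # a dummy batch of images
logits = backbone(x)
print(logits.shape)                                      # torch.Size([8, 5])
```

None of this gives the network an understanding of what the images depict; it simply reuses feature detectors that happen to be useful for more than one statistical mapping.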

More to the point, despite the incredible advances of deep learning (and they truly are incredible), there is no reason to assume that we are any closer today to achieving human (or even mouse) intelligence than we were a couple of decades ago. There is no strong evidence that incremental progress in deep learning, or in approaches such as GANs, will lead us to AGI at some point in the future.

It is also not at all clear that we could reproduce the capabilities of the brain even if we could mimic its connections at some level. For example, even though a fruit fly has only around 200,000 neurons, we have still not been able to simulate something with comparable competence at a variety of tasks. In reality, it is not merely the connections that matter but how signals are propagated, i.e., the specific functional form of the relation between neuronal inputs and outputs. There are (in principle) infinitely many of these, and even assuming that sufficiently similar functional forms lead to similar qualitative behavior still leaves too large a space to be explored exhaustively.

Hype vs Reality

Part of the problem today is that machine learning, and in particular deep learning, is often portrayed as an all-conquering force that will not stop until all human activity has been replaced by artificial agents and robots. This may or may not turn out to be true, but the uncritical reproduction of this view, without any reflection on the actual state of the field, amplifies the hype and drowns out skeptical voices. Further, the conflation of narrow AI performance with the more ambitious but unrealized goal of developing systems capable of broad reasoning, thinking and cognition adds to the problem.

Ultimately, it is very difficult to predict if and when humanity will be able to create artificial general intelligence. Some argue that there will be very little time between the early, poor versions of AGI and an all-powerful, super-intelligent AGI. If that were the case, we might not be able to prepare for and react to its rapid emergence in any meaningful way. It would therefore seem advisable to stay several steps ahead of such a potential phenomenon and to be wary of the consequences of current research.

While it is important that we devote appropriate resources to understanding and preparing for that eventuality, there is equally good, or even better, reason to sit back, reflect, and re-examine how much actual progress we have made towards AGI in the first place.
